10 things I didn’t know about Amazon’s CloudFront

After migrating my blog to Amazon Web Services, I decided to accelerate it using their CDN offering. Overkill? Perhaps. Gratifying? Absolutely! With almost 20 worldwide PoPs, the response times as seen by Pingdom plummeted during my migration last month. Here are 10 things I didn’t know going in: 1. CloudFront is ...

Most Popular

  • Visualizing how kernel 3.0’s initial congestion window increase is lowering response times

    When the recent IETF internet draft matures to an RFC, it’ll be the first increase in the initial congestion window (cwnd / TCP_INIT_CWND) since 2002. The implementation already made its way into 2.6.39 earlier this year, so I thought I’d take 3.0 for a spin and demonstrate the small-object acceleration it yields. I’m testing using a VPS node 100ms RTT away and loading objects ranging from 4kB to 128kB:

    image

    image

    image

    image

    The head start the larger initial congestion window offers favors smaller objects, and in the 8kB range the entire content can be sent in a single round trip:

     

    image

     

    image
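To make the single-round-trip claim concrete, here is a rough back-of-the-envelope sketch. It is not taken from the measurements above, and it assumes an MSS of 1460 bytes, an old initial window of 3 segments, a new one of 10 segments, idealized slow start (the window doubles every round trip), no loss, and no delayed ACK effects. The handshake and request round trips are not counted. The helper name `round_trips` is made up for this illustration.

```python
import math

def round_trips(size_bytes, init_cwnd, mss=1460):
    """Count the round trips idealized slow start needs to deliver size_bytes."""
    segments_left = int(math.ceil(size_bytes / float(mss)))
    cwnd = init_cwnd  # assumed initial window, in segments
    trips = 0
    while segments_left > 0:
        segments_left -= cwnd  # one full window is sent per round trip
        cwnd *= 2              # assumption: window doubles each round trip
        trips += 1
    return trips

# Compare the assumed old (3) and new (10) initial windows for the tested sizes
for size_kb in (4, 8, 16, 32, 64, 128):
    size = size_kb * 1024
    print("%3dkB  IW3: %d  IW10: %d" % (size_kb, round_trips(size, 3), round_trips(size, 10)))
```

Under these assumptions an 8kB object is 6 segments, so it needs two round trips with a window of 3 and only one with a window of 10. Real transfers will differ with the actual MSS, ACK behavior, and loss.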

  • High speed ffmpeg cluster encoding with Python and avidemux

    When it comes to clustered video codec conversion there are two general scenarios:

    Scenario 1: Encoding many videos across many computers
    Scenario 2: Encoding a single video across computers

    Scenario 1 is ubiquitous, and most encoding clusters are likely running at full steam with a backlog of videos waiting in queue. Scenario 2 is less common but useful under deadlines, where putting the whole cluster to work on a single video cuts the conversion time tremendously.

    I searched the Google cavern for scenario 2 and didn’t find any existing ffmpeg cluster implementations, so I spent my Sunday afternoon writing a Python script to do just that. Now, using the 4 PCs at home, I’m converting a single video 300% faster. So how does it work? In a sentence: I split the encoding into ffmpeg tasks (using -ss and -t), distribute the tasks to my cluster, and join the parts into the final version using avidemux (--append and --rebuild-index). Is it perfect? Probably far from it. But as a first draft it worked great. I tested several sources and formats, and the video and audio merged seamlessly and in sync. The code has no error catching and you may need to massage it to work in your setup. I’ll work on a second draft converting to H.264 instead of FLV.

    
    #!/usr/bin/python
    # Version 0.1
    # Big todo is adding error catching

    import sys
    import os
    from re import search
    from subprocess import PIPE, Popen

    # Configure the two parameters below
    # 1. The names of all the hosts in the cluster that will participate
    hostList = ['one', 'two', 'three', 'four']
    # 2. The NFS mounted dir which contains the video you need encoded
    encodeDir = "/net/ffcluster"

    # Function definitions
    def getDurationPerJob(totalFrames, fps):
        # seconds of video each host has to encode
        return totalFrames / float(fps) / len(hostList)

    def getFps(file):
        # ffmpeg prints the stream information on stderr; read it as text
        information = Popen(("ffmpeg", "-i", file), stdout=PIPE, stderr=PIPE, universal_newlines=True)
        # fetching tbr (1), but can also get tbn (2) or tbc (3)
        # examples of fps syntax encountered are 30, 30.00, 30k
        fpsSearch = search(r"(\d+\.?\w*) tbr, (\d+\.?\w*) tbn, (\d+\.?\w*) tbc", information.communicate()[1])
        fps = fpsSearch.group(1)
        # ffmpeg abbreviates thousands with a trailing k (30k means 30000)
        if fps.endswith("k"):
            return float(fps[:-1]) * 1000
        return float(fps)

    def getTotalFrames(file, fps):
        information = Popen(("ffmpeg", "-i", file), stdout=PIPE, stderr=PIPE, universal_newlines=True)
        # matches the hh:mm:ss.xx duration timecode
        timecode = search(r"(\d+):(\d+):(\d+)\.(\d+)", information.communicate()[1])
        return ((((float(timecode.group(1)) * 60) + float(timecode.group(2))) * 60) + float(timecode.group(3)) + float(timecode.group(4))/100) * float(fps)

    def clusterRun(file, fileName, durationPerJob, fps):
        start = 0.0
        end = durationPerJob
        runCount = 0
        jobList = []
        # submits equal conversion portions to each host
        for i in hostList:
            runCount += 1
            runFfmpeg = "ssh %s 'cd %s;ffmpeg -ss %f -t %f -y -i %s %s </dev/null'" % (i, encodeDir, start, end, file, fileName + "_run" + str(runCount) + ".flv")
            start += end + 1/float(fps)
            jobList.append(Popen(runFfmpeg, shell=True))
        # wait for all jobs to complete
        for job in jobList:
            job.wait()
        # append/rebuild final from parts and rebuild index
        avidemuxHead = "avidemux2_cli --autoindex --load %s_run1.flv --append %s_run2.flv " % (fileName, fileName)
        avidemuxTail = "--audio-codec copy --video-codec copy --save %sFinal.flv" % (fileName)
        # add --appends for additional hosts above the first 2
        for i in range(len(hostList) - 2):
            avidemuxHead = "%s --append %s_run%d.flv " % (avidemuxHead, fileName, i+3)
        runAvidemux = "%s %s" % (avidemuxHead, avidemuxTail)
        # block until the merge has finished
        Popen(runAvidemux, shell=True).wait()

    # Main begin
    sourceFile = sys.argv[1]
    fps = getFps(sourceFile)
    totalFrames = getTotalFrames(sourceFile, fps)
    durationPerJob = getDurationPerJob(totalFrames, fps)
    fileName = os.path.splitext(sourceFile)[0]

    clusterRun(sourceFile, fileName, durationPerJob, fps)
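The splitting step can also be looked at in isolation, without ssh or ffmpeg installed. The sketch below is only an illustration, not part of the script: `split_segments` and `ffmpeg_args` are hypothetical helper names, and unlike the script it uses contiguous segments with no one-frame offset between them. The only ffmpeg options it relies on are the ones the script already uses (-ss, -t, -y, -i).

```python
def split_segments(total_seconds, hosts):
    """Return one (host, start, duration) tuple per host, covering the whole video."""
    per_host = total_seconds / float(len(hosts))
    return [(host, index * per_host, per_host) for index, host in enumerate(hosts)]

def ffmpeg_args(source, output, start, duration):
    """Build the argument list for encoding one slice of the source file."""
    return ["ffmpeg", "-ss", "%.3f" % start, "-t", "%.3f" % duration,
            "-y", "-i", source, output]

# Example: a 120 second video split across four hypothetical hosts
hosts = ["one", "two", "three", "four"]
for run, (host, start, duration) in enumerate(split_segments(120.0, hosts), 1):
    args = ffmpeg_args("video.avi", "video_run%d.flv" % run, start, duration)
    print(host, " ".join(args))
```

Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces, although the ssh wrapper in the script would still need its own quoting.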
    
  • I fixed conntrack-viewer 1.3 for 2.6.18-194.el5

    This neat Perl script for viewing your masqueraded connections via ip_conntrack hadn’t been updated since 2002 and was erroring out with the messages below. Fixing it involved correcting the regexes for the new version of netfilter. Since it’s GPLed I’m including the modified source here: http://www.mediafire.com/?pes7sb66vhgp77j

      Use of uninitialized value in getservbyport at ./conntrack-viewer.pl line 114.
      Use of uninitialized value in getservbyport at ./conntrack-viewer.pl line 115.
      Use of uninitialized value in length at ./conntrack-viewer.pl line 128.
      Use of uninitialized value in length at ./conntrack-viewer.pl line 133.
      Use of uninitialized value in length at ./conntrack-viewer.pl line 143.
      Use of uninitialized value in concatenation (.) or string at ./conntrack-viewer.pl line 151.
      Use of uninitialized value in string ne at ./conntrack-viewer.pl line 154.
      Use of uninitialized value in subroutine entry at ./conntrack-viewer.pl line 162.
      Use of uninitialized value in gethostbyaddr at ./conntrack-viewer.pl line 162.
      Use of uninitialized value in gethostbyaddr at ./conntrack-viewer.pl line 163.
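Warnings like these are what you get when a positional regex stops matching because fields were added or moved. As a rough illustration of a more tolerant approach (in Python rather than the script's Perl, and not the actual fix in the modified source), the sketch below reads the key=value pairs instead of relying on field positions. The sample line is made up for this example and only approximates the ip_conntrack format; the exact fields vary by kernel. `parse_conntrack_line` is a hypothetical helper name.

```python
import re

def parse_conntrack_line(line):
    """Parse one conntrack entry into a dict, keeping the first value seen per key."""
    entry = {"proto": line.split()[0]}
    for key, value in re.findall(r"(\w+)=(\S+)", line):
        # src/dst/sport/dport appear twice (original and reply direction);
        # setdefault keeps the original direction and ignores the reply.
        entry.setdefault(key, value)
    return entry

# Hypothetical sample line, assumed for illustration only
sample = ("tcp      6 431999 ESTABLISHED src=192.168.1.2 dst=10.0.0.1 "
          "sport=34567 dport=80 packets=10 bytes=1000 "
          "src=10.0.0.1 dst=192.168.1.2 sport=80 dport=34567 "
          "packets=8 bytes=900 [ASSURED] mark=0 use=1")
entry = parse_conntrack_line(sample)
print(entry["proto"], entry["src"], entry["sport"], entry["dst"], entry["dport"])
```

Because unknown keys are simply collected, extra fields added by a newer netfilter version do not break the parse.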
  • Profiling with Traceview using the Android SDK

    I noticed my seekbars weren’t responsive or fluid, so I used the profiler to check what was wrong. I swiped the seekbar 3 times, and you can clearly see that SQLite was the problem:

    image

    I improved the code, and the only remaining spike is the initialization of the SQL database, which can be seen in both traces:

    image