
December 11, 2016
Build your own Anycast Network in 9 steps

Virtually every web page or app you use talks to a remote server with a unique IP address. When popular websites or apps have servers around the world, one problem is that complex systems have to be built to make sure you’re talking to the optimal server (usually the closest one to you), otherwise performance can suffer. With anycast, a single IP address is assigned to the servers worldwide and then the glue of the internet, BGP (Border Gateway Protocol), routes you to the “closest” server. That server isn’t necessarily the closest to you physically, but the one that is “network close” to you. While anycast isn’t new, using it to improve performance for web traffic is a relatively recent trend adopted by sites like LinkedIn.

If you want to see anycast in action, click here to see where my network routes you.


This guide is intended for techies and sysadmin types who’d like to build a “Hello World” anycast network. If you’re not technical, consider skimming it to get an overview of how Internet plumbing is assembled. Running your own network is not only fun, but also instructive, and will give you a different vantage point on key topical issues like net neutrality, censorship, and IPv6.

The administrative steps will take a couple weeks since you’ll be interacting with other organizations. The technical steps will take a few evenings to get things up and running, and considerably more time if you decide to science the heck out of it. The completed network spans 20 Points of Presence (PoPs) across 6 continents, as confirmed with a trial Catchpoint account.

1. Register with a Regional Internet Registry

The Regional Internet Registry (RIR) is the body that’ll assign the building blocks such as your subnet and Autonomous System. The location of your service, your sensitivity to cost, and how much red tape you’re comfortable with will determine which RIR you pick. I gravitated to RIPE as it had a good mix of accessibility and cost, and 4 of the points of presence (PoPs) I was planning to use were in Europe. Registration is straightforward. Get familiar with the RIPE Database, as you can immediately start creating the metadata (maintainer, person, organization, etc.) used in the next steps.

Cost: $0

2. Acquire a /24 and /48

The smallest subnets you can advertise over BGP are a /24 for IPv4 and a /48 for IPv6 so you’ll need to get your hands on at least one of them. Acquiring IPv6 space is relatively easy due to its abundance and some organizations will assign you a /44 for free.

IPv4 is optional for this project. Just like with many commodities in life, you have two choices: lease or buy. Owning is pricey and more complex for a host of other reasons, and even if you receive a /22 from RIPE (as of Dec. 2016 they still have 13.3 million IPv4 addresses to hand out) you’ll be stuck with hefty annual RIR fees. I initially went down this route and could write an entire other blog post about that. For this project and the remainder of this guide I’m going to cover leasing, as it has low up-front costs and lets you go month to month. If you decide to use IPv4 you’ll need to shop around; webhostingtalk and lowendbox are your friends. After discussions with several parties it became clear that Prager IT was the route to go, as their professionalism and enthusiasm checked all my boxes. Not only do they include a complimentary IPv6 /48 with the lease of IPv4 space, but they were also responsive to multiple Letter of Authority (LOA) requests. While the leasing process involved some administrative slog (contract work and somewhat elaborate verification), it was otherwise straightforward.

Cost: $0 for IPv6 / $55 per month for IPv4

3. Apply for an Autonomous System

The Autonomous System Number (ASN) is what identifies your network to routers. This step is technically optional. You can announce your IP space without a public ASN by using a private ASN with a transit provider that already has a public ASN. However, without a public ASN you won’t be able to peer or use advanced traffic management like prepending and communities. If you don’t want an ASN, jump to step 4.

With RIPE, there is no fee for an ASN as long as you get sponsored by an existing RIPE member (LIR). The recommendation here is to lease your IP space from someone that’ll sponsor you. Also note that applying for an ASN requires that you be “multi-homed”, meaning that your paperwork will need to show you have contracts with at least two transit providers (see next step).

Once you have your ASN you can create IPv4 and IPv6 route objects in the RIPE DB so that transit providers can automatically whitelist your prefixes without an LOA.

Cost: $0

4. Get Connectivity and Virtual instances

You’ll acquire the last two building blocks of your network in this step. Getting connectivity and compute cycles has traditionally been a daunting step, as it required getting real hardware into colocated space or an Internet Exchange (IX). By leveraging Virtual Private Server (VPS) providers that support “Bring Your Own IP” you get easy access to both transit and virtual instances. While most networks peer with others, for simplicity we’ll rely solely on transit.


Single home: My recommendation here is to use Vultr as your primary cloud+transit provider as they have 15 locations across 4 continents, have good BGP support (like communities), and will set you back $5 per month per PoP (BGP session fees included). And, if you don’t have an ASN, you can use theirs. Because of their high level of automation, once your BGP session has been approved it’s really as simple as spinning up VPS instances and announcing your space.

Multi-home: While this step is optional, redundancy is always recommended. The goal here is twofold. First, add a second provider in your primary markets (North America in my case), then maximize coverage. As I wanted to have presence on 6 continents and shore up APAC, I strategically picked Hong Kong, India, Brazil, and South Africa as my next expansions and set out to find providers in those regions.

After a lot of Googling and emails, I settled on these 3 providers:

HostUs: Hong Kong, Washington DC
Leapswitch: Mumbai
Host1Plus: Brazil, South Africa

Because each provider has its own processes and environments, this step will take some time. I ran into many issues such as IP binding, upstream filtering & approvals, invalid boot devices, BGP config issues, nuances with OpenVZ and IPv6, LOA requests, iptables and selinux madness, an inoperable VNC/console, and timezone delays. Despite this, the support staff were terrific, which helped push the process along (and a special shout out to Andrew at HostUs for going above and beyond). VPS fees range from $5 to $7 a month; however, these providers also tack on one-time or recurring BGP session fees.
There are more VPS suggestions here. I also found Nat Morris’s great presentation about anycast on a shoestring when completing this write-up. It was interesting to see how we tackled the same problem for different audiences in very different ways. Some other providers that I considered were Zappie, Packet.net (bare metal), and Safehouse. YMMV.

Cost: $5 per PoP per month

5. Configure your cloud platform

Here’s where you set up your favorite web services on one instance. For HTTP I went with Caddy, a reverse proxy load balancer written in Go that’s been used by the likes of Netflix and Gopher Academy. It’ll give you HTTP/2 and TLS (through Let’s Encrypt) with just a few config lines. For DNS I went with NSD, a lightweight authoritative server used by some TLDs and root servers. The recommendation here is to configure basic health checks that validate BGP, DNS, and HTTP, and restart the services in a logical fashion when they fail. Config management helps, and do harden your boxes.
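To make the health-check idea concrete, here’s a minimal sketch in Python. It assumes systemd-managed caddy, nsd, and bird services and probes localhost; the service names, URL, and restart logic are placeholders, not the exact checks I run.

#!/usr/bin/env python3
# Minimal health-check sketch: probe the local HTTP and DNS services, restart
# them if they look dead, and stop bird (withdrawing the anycast routes) if the
# PoP still isn't healthy so traffic fails over to another site.
import socket
import subprocess
import urllib.request

def http_ok(url="http://127.0.0.1/", timeout=3):
    try:
        return urllib.request.urlopen(url, timeout=timeout).getcode() < 400
    except Exception:
        return False

def dns_ok(host="127.0.0.1", port=53, timeout=3):
    # Cheap liveness check: can we open a TCP connection to the DNS port?
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return True
    except OSError:
        return False

def restart(service):
    subprocess.run(["systemctl", "restart", service], check=False)

if __name__ == "__main__":
    if not http_ok():
        restart("caddy")
    if not dns_ok():
        restart("nsd")
    if not (http_ok() and dns_ok()):
        # Still broken: withdraw this PoP's announcements rather than blackhole users.
        subprocess.run(["systemctl", "stop", "bird"], check=False)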

6. Announce your /24 & /48 from one PoP

With your platform ready, it’s time to announce it to the world. You’ll need to pick a BGP client like exabgp, bird (what I used), or quagga and configure 4 announcements: the /24 and /48 (your entire IPv4/IPv6 space), and a /32 and /128 (the instance). The former pair gets shared with the world; the latter only with your local router.

If you’re on OpenVZ virtualization you may skip the latter due to how your IPs are bound. The `birdc show route` and `birdc show proto all` commands and BGP looking glasses are your friends. Vultr has a good bird guide. Once you form neighbors and announce your space to the world, move to the next step.

7. Announce from multiple PoPs


After you have one host working, replicate your configs to multiple PoPs. Once you’re announcing from a second site you’re officially anycasted.

You can quickly validate your performance by using any global ping or traceroute tool that measures latency from multiple locations.

8. Correcting glaring routing issues

The network built so far is certainly not optimized and there will be routing woes. While anycast provides an elegant, simplified way of directing users based on routing policy and shortest AS path, it does have some pitfalls. BGP is not latency aware, not QoS aware, and not server load aware. Having administered a few large scale anycast DNS platforms in prior jobs, I have seen many cases of users being routed to a sub-optimal country or continent, of server load asymmetry, and of routing over degraded networks.

For this guide I’ll cover what to spot check as a first pass to ensure that routing is at least somewhat sane. Because the Internet is constantly changing, a real anycast network needs to be continuously tuned.


Because your upstreams have varied levels of connectivity, they will attract traffic from varied types of networks. In my case, some remote nodes were so well connected that they were pulling traffic from Europe and North America. While there are many knobs, the three main tools in your belt are communities, prepending, and selective announcements. Since not all providers supported self-service communities, and prepending failed to help, I needed to work with my upstreams.

Because my HTTP responses include a header with the PoP name, I could tag traffic in Catchpoint to show traffic patterns. The first half of the following graph shows traffic from around the globe landing on the Mumbai PoP. After I requested that my upstream drop all its peers and announce only out of the National Internet Exchange of India, we see the situation improve.

HostUs, my Hong Kong provider, also worked closely with me to tune the communities.

9. GeoDNS to the rescue


Host1Plus’s Sao Paulo and Johannesburg nodes provided a couple of challenges. First, my Sao Paulo BGP announcements were pulling traffic from North America, and the Johannesburg upstream delayed provisioning my BGP session for weeks. Because Host1Plus didn’t support self-service communities and couldn’t assist me with setting communities either, I decided to improvise. The plan was now to use anycast globally and unicast strategically. Because I was using a 3rd-level domain, I was able to use my 2nd-level DNS service, Route53, to advertise the unicast IPs of the Sao Paulo and Johannesburg nodes in those regions. Using GeoDNS for DNS name server A records is not common; however, it is totally valid, practical for 3rd-level domains, and used by at least one of the top 5 Alexa sites.
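As a rough illustration of that Route53 setup, here’s a sketch using the boto3 SDK. The zone ID, record name, and IPs are placeholders (documentation ranges), and the continent codes simply hand South American and African resolvers the regional unicast addresses while everyone else keeps getting the anycast IP.

#!/usr/bin/env python3
# Sketch: geo-targeted A records for a name server hostname in Route 53.
import boto3

route53 = boto3.client("route53")

def geo_a_record(name, ip, continent, set_id):
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,
            "GeoLocation": {"ContinentCode": continent},
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="Z123EXAMPLE",  # placeholder hosted zone
    ChangeBatch={"Changes": [
        # South America and Africa get the regional unicast IPs...
        geo_a_record("ns1.anycast.example.com.", "203.0.113.10", "SA", "sao-paulo-unicast"),
        geo_a_record("ns1.anycast.example.com.", "203.0.113.20", "AF", "johannesburg-unicast"),
        # ...and the default record hands everyone else the anycast IP.
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "ns1.anycast.example.com.",
                "Type": "A",
                "SetIdentifier": "default-anycast",
                "GeoLocation": {"CountryCode": "*"},
                "TTL": 60,
                "ResourceRecords": [{"Value": "198.51.100.1"}],
            },
        },
    ]},
)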

In total, how much will this project set you back?

You can spend zero or as much as you want.

$0: Use free IPv6 space, Vultr’s ASN, and self-labeled “free VPS hosting”.
$65+ a month: Lease your IPv4 and IPv6 and spin up anywhere from 2 to 15 Vultr PoPs.
$190 a month + one-time BGP session fees: The network described in this guide.

If you have reservations about spending the money, consider how much some networking classes and labs cost; that may be all the justification you need. If you’re more comfortable building in a laboratory environment rather than on the real Internet, you can somewhat follow along using the DN42 network. DN42 is a private Internet operated by enthusiasts and is a safe sandbox.

What’s next?

There are many things to test, including peering, traffic steering, route dampening, and BGP convergence. You could try these yourself, or you could just wait… Because yes, I just gave you a preview of my next posts.



April 26, 2014
Protect Against DNS Bitsquatting with TLS

Bitflipping occurs when 1s and 0s spontaneously flip at various levels in the stack (memory, network, storage) and, when not corrected, can cause erroneous hostnames to materialize. My first exposure to bitflipping was at Yahoo!; my team had the foresight to acquire our bitflipped permutations many years prior. As the topic of bitsquatting surfaces every now and again, I decided to test the premise myself. I picked a key domain name that I knew carried an enormous portion of internet traffic, “rented” the bitflipped permutations (I unregistered the domains within 72 hours), and soon started receiving stray HTTP requests intended for major web properties (the top requested host headers were for two tech giants with a combined market cap over $700B).

Completely absent from the HTTP logs (but present in the authoritative DNS logs) were host names for sites that ran full TLS/SSL. Clients directed to my Apache instance over :443 wouldn’t proceed since I wasn’t the correct entity… SSL for the win! While I could have generated self-signed certs, this was beyond the scope of my weekend, and many of the requests were from mobile apps which may not present an “accept a mismatched/invalid cert” prompt.

In summary, while registering bit flipped permutations of your domain can improve security, far more mileage is achieved by simply migrating to full SSL.

Here’s how you can quickly explore the world of Bitflipping:

1. Select your domain of choice. Since bitflips are infrequent, your odds are improved by using a popular record.

2. Use this simple script I hacked together to find the permutations.

./bit_flip_permutations.py -d $resource_record
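In case that script link goes stale, the core idea fits in a few lines of Python. This is my own minimal sketch (not the script from the post): flip each bit of each character and keep only candidates that are still legal hostname labels.

#!/usr/bin/env python3
# Emit single-bit-flip permutations of a domain label, e.g. "example" -> "dxample", ...
import string
import sys

VALID = set(string.ascii_lowercase + string.digits + "-")

def bit_flips(label):
    for i, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped != ch and flipped.lower() in VALID:
                candidate = (label[:i] + flipped + label[i + 1:]).lower()
                if not candidate.startswith("-") and not candidate.endswith("-"):
                    yield candidate

if __name__ == "__main__":
    for name in sorted(set(bit_flips(sys.argv[1].lower()))):
        print(name)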

3. Quickly list out which domains aren’t registered yet.


for i in `./bit_flip_permutations.py -d $resource_record`;
do
    whois $i.com | egrep '^No match|^NOT FOUND|^Not fo|AVAILABLE|^No Data Fou|has not been regi|No entri';
done

4. Register them with a good Registrar (I recommend NameCheap, great DNS support), and don’t forget to unregister them within 72 hours.

5. You’ll get more data if you can run your own authoritative DNS server with full logging.



March 2, 2013
Benchmarking SPDY vs. HTTP in 4 steps

Conclusions first! SPDY’s efficiency (header compression, TCP windowing and slow start behavior, etc.) not only allows for high throttles, but the inline images are pushed along with the index, relieving the browser of having to first parse and request those resources before they are sent. While SPDY’s response time was faster than HTTP for a single-domain page, a more real-world scenario would involve sharding. SPDY saw no substantial improvement with sharding, but HTTP did (albeit requiring 24 times more sockets!).

image

image

image

Step 1. Configuring SPDY on the server

  • yum install mod_ssl && rpm -Uvh https://dl-ssl.google.com/dl/linux/direct/mod-spdy-beta_current_x86_64.rpm
  • Enable/Disable SPDY via /etc/httpd/conf.d/spdy.conf using “SpdyEnabled on”
  • Enable non-SSL SPDY to facilitate debugging using “SpdyDebugUseSpdyForNonSslConnections 2”. Performance with SSL would be slightly different.
  • Turn on Keep-Alive and set MaxKeepAliveRequests to 200 so that Apache isn’t a limiting factor. I didn’t bother with gzip since everything other than the index is already compressed.

Step 2. Configuring the Browser

  • Start Chrome with non-SSL mode enabled: chrome.exe --use-spdy=no-ssl
  • Display SPDY debugging @ chrome://net-internals/#spdy
  • Disable caching in Dev tools.

Step 3. Add artificial latency between you and the server using netem

image

Step 4. Creating the optimal test page

Since SPDY multiplexes all elements through a single TCP stream, I tested with both a single-domain page and a 4-way shard. Domain sharding and 3rd-party includes should be avoided for optimal SPDY performance.

image



February 17, 2013
Seeing EDNS client-subnet in two steps

1. Build a dig client with support

2. Query an Auth that speaks the language

Now that we have a compiled version of dig that supports including the client subnet in the query, we’re able to query authoritative servers with the flags enabled.
Here’s what a regular query for our favorite video site looks like:

image
Notice that the A records handed back are in North America. Now let’s resolve the record for a client in China:

image
The response now has an additional CLIENT-SUBNET flag specifying that this response is only valid for that subnet. The next difference is the lack of A records in the response; instead we get a CNAME chain, which’ll require another lookup.
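If you’d rather script these queries than patch dig, dnspython can attach the same EDNS client-subnet option. A rough sketch, assuming dnspython is installed; the hostname, authoritative server, and client subnet below are placeholders:

#!/usr/bin/env python3
# Send an A query carrying an EDNS client-subnet option to an authoritative server.
import dns.edns
import dns.message
import dns.query

auth_server = "192.0.2.53"                     # placeholder authoritative server
ecs = dns.edns.ECSOption("198.51.100.0", 24)   # pretend the client sits in this /24

query = dns.message.make_query("www.example.com", "A")
query.use_edns(edns=0, options=[ecs])
response = dns.query.udp(query, auth_server, timeout=3)
print(response)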

On the UDP side, an additional record of type OPT is included in both request and response with the extended data. At this time Wireshark doesn’t support displaying the specific data but a patch is available @ https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=7552

image



June 3, 2012
The Versatile DNS TXT record

Originally conceived as a simple descriptive placeholder, the TXT record has been used (and abused) in more creative ways than any other DNS record type. I couldn’t find a comprehensive list of all its Swiss-army uses, so here’s my stab at it:

1. Determining your external facing resolver & EDNS availability & advertised buffer size

2. Querying Wikipedia from a console

3. Determining the hostname (hostname.bind) and version (version.bind) of the responding server (see the sketch after this list)

4. Geolocation Using Reverse Lookup aka GURL (ran into this one at a recent Meetup)

5. Amplification attacks

6. Preventing spam (SPF, DMARC, DKIM)

7. Authentication of domain ownership

8. DNS tunneling for breaking through firewalls or siphoning free internet
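To illustrate item 3, here’s a small dnspython sketch that asks a name server for its version over a CHAOS-class TXT query. The server IP is a placeholder, and many servers refuse or mask the answer:

#!/usr/bin/env python3
# CHAOS-class TXT query for version.bind against a target name server.
import dns.message
import dns.query
import dns.rdataclass
import dns.rdatatype

server = "192.0.2.53"  # placeholder: the name server you want to fingerprint
query = dns.message.make_query("version.bind", dns.rdatatype.TXT, dns.rdataclass.CH)
response = dns.query.udp(query, server, timeout=3)
for rrset in response.answer:
    print(rrset)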



November 5, 2011
10 things I didn’t know about Amazon’s Cloudfront

After having migrated my blog to Amazon Web Services I decided to accelerate it using their CDN offering. Overkill? Perhaps. Gratifying? Absolutely!  With almost 20 worldwide PoPs the response times as seen by Pingdom plummeted during my migration last month:

image

Here are 10 things I didn’t know going in:

1. Cloudfront is barebones, offering only simple static caching. There are no accelerated proxies or advanced features like header manipulation, url rewriting, cookie exchanges, etc.

2. It is reliable and fast. In San Jose, I’m getting over a 5x improvement in response times compared to only using the EC2 origin:

image

Here are my numbers for the past 30 days based on Pingdom’s global polling:

image image

3. Origin max-age directives of less than 3600 are rounded up to an hour, so if your content is updated more frequently you’ll need to use invalidation, versioning, or skip caching it altogether.

4. There is no UI for invalidating content; it’s all done via APIs that you need to script yourself, and there are costly monthly limits. Here’s a PHP implementation for single file invalidation.
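For reference, the same single-file invalidation is only a few lines with today’s boto3 SDK (which postdates this post); the distribution ID and path are placeholders:

#!/usr/bin/env python3
# Invalidate a single object in a CloudFront distribution.
import time
import boto3

cloudfront = boto3.client("cloudfront")
cloudfront.create_invalidation(
    DistributionId="E1234EXAMPLE",  # placeholder distribution ID
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/index.html"]},
        "CallerReference": str(time.time()),  # any unique string
    },
)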

5. If you want even more speed, consider using their “Route 53” DNS service which you can manage from within the same console as CloudFront’s.  Their authoritative DNS servers are in the same 20 worldwide PoPs.

6. Updating distributions (CNAMEs, invalidations, enabling https, etc.) can take 20 or more minutes to push to all edges.

7. Logging is disabled by default.  To enable it you’ll need to have an S3 bucket space.

8. CF has an aliases feature so take advantage of it to enable domain sharding. By using 2 or more CNAMEs the browser can make more concurrent requests. I’m using cdn and www.

9. CloudFront makes HTTP 1.0 requests, so be sure your origin still correctly responds with gzipped content. For example, by default nginx serves uncompressed files for HTTP 1.0 requests even when compressed ones are requested. To override this you can add this to your nginx.conf: “gzip_http_version 1.0;”

10. CloudFront is not included in the 1 year free Amazon AWS offer so expect a bill for CF as well as for any origin fetch bandwidth that exceeds your free monthly aggregated bandwidth.  There are 2 monthly fees, GB out (about 2 dimes per GB) and # of requests (‘bout a penny per 10k). You get lower prices if you commit for more. My bill for the month was 25 cents (~50k object requests):

image

Looking back, moving to EC2 and Cloudfront was a sound decision which not only reduced my monthly VPS expenses but greatly improved performance and reliability.



September 17, 2011
3 steps to getting an EC2 Centos VM running

I’ve moved to EC2 thanks to the 1 year free Usage Tier Amazon offer. Even after the free year is up, the price should be on par with most decent Xen VPSes.

1. Once you’re at the AWS Management Console, launch a new instance and, when picking your Amazon Machine Image (AMI), select a rightimage_CentOS with a star and make sure to create a key pair:

image

2. Instances don’t come with an externally reachable IP so you’ll need to assign one using the Elastic IPs menu:

image

3. Add an ACL for SSH:

image
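If you’d rather script steps 2 and 3 than click through the console, here’s a rough sketch with today’s boto3 SDK (which postdates this post); the instance and security-group IDs are placeholders:

#!/usr/bin/env python3
# Attach an Elastic IP to an instance and open port 22 on its security group.
import boto3

ec2 = boto3.client("ec2")

# Step 2: allocate an Elastic IP and associate it with the instance
alloc = ec2.allocate_address(Domain="vpc")
ec2.associate_address(InstanceId="i-0123456789abcdef0",
                      AllocationId=alloc["AllocationId"])

# Step 3: allow inbound SSH
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
)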

That’s it! Now you can ssh into it with your key:

image

Some tidbits:

1. nginx cannot be run from the RPMs with this image; you’ll need to compile it:

2011/09/13 14:11:12 [emerg] 2405#0: eventfd() failed (38: Function not implemented)
2011/09/13 14:11:12 [alert] 2404#0: worker process 2405 exited with fatal code 2 and can not be respawn

2. You may want to rewrite /etc/yum.repos.d/CentOS-Base.repo to point to Centos mirrors (they’re pointing to Rightscale ones otherwise)



August 7, 2011
Visualizing how kernel 3.0’s initial congestion window increase is lowering response times

When the recent IETF internet draft matures into an RFC, it’ll be the first increase in the initial congestion window (cwnd / TCP_INIT_CWND) since 2002. The implementation has already made its way into 2.6.39 earlier this year, and I thought I’d take 3.0 for a spin and demonstrate the increase in small-object acceleration it yields. I’m testing using a VPS node 100ms RTT away and loading objects ranging from 4kB to 128kB:

image

image

image

image

The head start the larger congestion window offers favors smaller objects, and in the 8kB range the entire object can be sent in a single round trip (see the quick calculation at the end of this post):

 

image

 

image
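To see where that single-round-trip claim comes from, here’s a toy slow-start calculator (my own sketch; it ignores handshakes, losses, and receive-window limits, and assumes a 1460-byte MSS):

#!/usr/bin/env python3
# Count the round trips classic slow start needs to deliver an object.
MSS = 1460  # bytes per segment

def rtts_to_send(object_bytes, init_cwnd):
    segments_left = -(-object_bytes // MSS)  # ceiling division
    cwnd, rtts = init_cwnd, 0
    while segments_left > 0:
        segments_left -= cwnd
        cwnd *= 2      # slow start roughly doubles the window each round trip
        rtts += 1
    return rtts

for size_kb in (4, 8, 16, 32, 64, 128):
    print(f"{size_kb:>3} kB: {rtts_to_send(size_kb * 1024, 3)} RTTs at IW3, "
          f"{rtts_to_send(size_kb * 1024, 10)} RTT(s) at IW10")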



July 23, 2011
Visualizing how TCP window scaling improves throughput

Prior to IETF’s RFC 1323, the 16 bits allotted to describe the receive window size limited it to a maximum of 65,535 bytes. Window scaling permits receive windows of up to a gigabyte through a scale factor defined in the first SYN, and this approach was adopted in Windows and Linux in the 2000s:

image

In Windows 7 you can disable the scaling factor with the following autotuninglevel command:

image 

image

 

In the latter case, the receive window’s max value is pegged at 64KB, while in the former, the window is left-shifted 2 bits for a scaling factor of 2^2, or 256KB. Using Microsoft’s TCP analyzer on a 5MB test file at 100ms RTT, it’s clear that in this case the scaled window allows a 2x improvement in throughput:

 

image
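Doing the arithmetic on those two windows shows why: a single TCP stream can’t move more than one window per round trip, so the window divided by the RTT puts a hard ceiling on throughput. A quick sketch of the calculation:

#!/usr/bin/env python3
# Throughput ceiling imposed by the receive window: window / RTT.
def max_throughput_mbit(window_bytes, rtt_seconds):
    return window_bytes * 8 / rtt_seconds / 1e6

for window in (65535, 262144):  # unscaled 64KB vs. 256KB with a 2-bit scale factor
    print(f"{window} byte window @ 100 ms RTT -> "
          f"{max_throughput_mbit(window, 0.1):.1f} Mbit/s ceiling")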



March 20, 2011
Song of my week: Clare Maguire – The Last Dance

Awesome cross between Sia’s Breathe me and Annie Lennox!



January 16, 2011
Nagios returns: Warning: Return code of 127 for check of service ‘foo’ on host ‘localhost’ was out of bounds.

My home servers were being flagged as down with the warnings below. Turns out my /usr/lib64/nagios/plugins/ directory got corrupted during an rsync. The easy fix was to reinstall the nagios-plugins rpm:

yum reinstall nagios-plugins

Wish all issues were this simple.

[1236830775] Warning: Return code of 127 for check of service 'PING' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236830815] Warning: Return code of 127 for check of service 'Root Partition' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236830855] Warning: Return code of 127 for check of service 'SSH' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236830895] Warning: Return code of 127 for check of service 'Swap Usage' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236830925] Warning: Return code of 127 for check of service 'Total Processes' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236830965] Warning: Return code of 127 for check of service 'Current Load' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236830975] Warning: Return code of 127 for check of host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236831005] Warning: Return code of 127 for check of service 'Current Users' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1236831045] Warning: Return code of 127 for check of service 'HTTP' on host 'localhost' was out of bounds. Make sure the plugin you're trying to run actually exists.


December 18, 2010
I fixed conntrack-viewer 1.3 for 2.6.18-194.el5

This neat Perl script for viewing your masqueraded connections via ip_conntrack hadn’t been updated since 2002 and was erroring out with the messages below. Fixing it involved correcting the regexes for the new version of netfilter. Since it’s GPLed I’m including the modified source here: http://www.mediafire.com/?pes7sb66vhgp77j

Use of uninitialized value in getservbyport at ./conntrack-viewer.pl line 114.
Use of uninitialized value in getservbyport at ./conntrack-viewer.pl line 115.
Use of uninitialized value in length at ./conntrack-viewer.pl line 128.
Use of uninitialized value in length at ./conntrack-viewer.pl line 133.
Use of uninitialized value in length at ./conntrack-viewer.pl line 143.
Use of uninitialized value in concatenation (.) or string at ./conntrack-viewer.pl line 151.
Use of uninitialized value in string ne at ./conntrack-viewer.pl line 154.
Use of uninitialized value in subroutine entry at ./conntrack-viewer.pl line 162.
Use of uninitialized value in gethostbyaddr at ./conntrack-viewer.pl line 162.
Use of uninitialized value in gethostbyaddr at ./conntrack-viewer.pl line 163.


December 17, 2010
iptables won’t start, just exits with no message

I was setting up a router and had no netfilter dir under ipv4 so I tried to start iptables but it would exit with no message. I debugged the /etc/init.d/iptables script and determined it was exiting because there was no /etc/sysconfig/iptables file:

start() {
    [ -f "$IPTABLES_DATA" ] || return 1

I created an empty one by touching it and now iptables starts. Why was there no error message? Whoever wrote the iptables start script at netfilter HQ never put one in!



December 13, 2010
Ephemeral port exhaustion with php-fpm / mysql / nginx

On my LNMP rig I was running out of ephemeral ports even after extending the range across all registered ports in my sysctl.conf:

net.ipv4.ip_local_port_range = 1024 65000

Turns out php-fpm and mysql were both using TCP, requiring at least two 127.0.0.1 connections per page request. With the TIME_WAIT minimum fixed at 60 seconds on CentOS, the most I could handle was about 30k page requests per minute (64k ports / 2). After setting both php-fpm (/etc/php-fpm.d/www.conf: listen = /var/run/php-fpm/default.socket) and mysql (/etc/my.cnf: socket=/var/lib/mysql/mysql.sock) to use Unix sockets, and changing my PHP code to use them (mysql_connect('localhost:/var/lib/mysql/mysql.sock', …)), my ephemeral port usage is down to zero. Don’t forget to also adjust net.ipv4.ip_conntrack_max and the related timeouts to allow more connections.



December 9, 2010
nginx buffer errors during cacti install

nginx was working fine until I installed cacti.  I kept getting 502 errors and this message:

 2010/12/09 19:03:39 [error] 20002#0: *283 upstream sent too big header while reading response header from upstream, client: 192.34.56.2, server: _, request: "GET /cacti/ HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "192.23.2.2" 

The solution is to increase the buffers from their defaults, which are listed here: http://wiki.nginx.org/HttpFcgiModule. I tried various values, and the minimum values that would work with cacti were the following in /etc/nginx/nginx.conf:

location ~ \.php$ {
    fastcgi_buffers 8 16k;
    fastcgi_buffer_size 32k;
}


December 1, 2010
tcp_fin_timeout doesn’t control TIME_WAIT

Doing some tcp tuning and noticed a ton of TIME_WAIT connections.  Many online guides suggested setting /proc/sys/net/ipv4/tcp_fin_timeout to 15 but netstat with timers was still counting down from 60:

Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0      0 localhost.locald:cslistener localhost.localdomain:53113 TIME_WAIT   timewait (57.46/0/0)
tcp        0      0 vps.server.com:http         c-123-123-123-123.xyz.:2944 TIME_WAIT   timewait (57.55/0/0)

Turns out you can’t lower TIME_WAIT on linux from 60 seconds since it’s hard coded in the source:

include/net/tcp.h:
#define TCP_TIMEWAIT_LEN (60*HZ) /* how long to wait to destroy TIME-WAIT state, about 60 seconds */

The temp workaround is to enable tcp_tw_recycle or better, tcp_tw_reuse.

BTW, BSD doesn’t hard-code this the same way; its equivalent of TCP_TIMEWAIT_LEN can be tuned.



November 28, 2010
"bash: rpm: command not found"

Tired and ready for bed, I accidentally deleted yum and rpm through the dependency magic of “yum --erase crontabs”. Without rpm, how was I going to install rpm? Since this was on a minimal image on a VPS I had few options:

1. Repair install from media – not an option on a cheapo VPS

2. rsync from identical remote machine – rsync wasn’t installed on local host

3. compile from source – gcc not installed on local host

4. reset image to original on VPS – nope, that is admitting defeat

I ended up tarring up all the files from rpm and rpm-libs on an identical remote host and copying those over:

#on remote host send list of all files of the rpm and rpm-libs package into a temp file
rpm -ql rpm > /tmp/rpmList
rpm -ql rpm-libs >> /tmp/rpmList

#copy all the files into tarball
tar cvf rpm.tar `cat /tmp/rpmList`

#send tarball to broken machine
scp -P12345 rpm.tar root@remote:/

#untar and problem fixed
cd /
tar xvf rpm.tar


November 23, 2010
High speed ffmpeg cluster encoding with Python and avidemux

When it comes to clustered video codec conversion there are two general scenarios:

Scenario 1: Encoding many videos across many computers
Scenario 2: Encoding a single video across computers

Scenario 1 is ubiquitous, and most encoding clusters are likely running at full steam with a backlog of videos waiting in the queue. Scenario 2 is less common but useful when you have a deadline, where concertedly converting a single video across your cluster can reduce the time tremendously.

I searched the google cavern for scenario 2 and didn’t find any existing ffmpeg cluster implementations, so I spent my Sunday afternoon writing a Python script to do just that. Now, using the 4 PCs at home, I’m converting a single video 300% faster. So how does it work? In a sentence, I split the encoding into ffmpeg tasks (using -ss and -t), distribute the tasks to my cluster, and copy the parts into the final version using avidemux (--append and --rebuild-index). Is it perfect? Probably far from it. But as a first draft it worked great. I tested several sources and formats and the video/audio merged seamlessly and in sync. The code has no error catching and you may need to massage the code to work in your setup. I’ll work on a second draft converting to h.264 instead of flv.


#!/usr/bin/python
# Version 0.1
# Big todo is adding error catching

import sys
import os
from re import search
from subprocess import PIPE, Popen

#configure the two parameters below
#1. The name of all the hosts in the cluster that will participate
hostList = ['one', 'two', 'three', 'four']
#2. The NFS mounted dir which contains the video you need encoded
encodeDir = "/net/ffcluster"

#Function definitions
def getDurationPerJob(totalFrames, fps):
    return totalFrames / float(fps) / len(hostList)

def getFps(file):
    information = Popen(("ffmpeg", "-i", file), stdout=PIPE, stderr=PIPE)
    #fetching tbr (1), but can also get tbn (2) or tbc (3)
    #examples of fps syntax encountered is 30, 30.00, 30k
    fpsSearch = search("(\d+\.?\w*) tbr, (\d+\.?\w*) tbn, (\d+\.?\w*) tbc", information.communicate()[1])
    return fpsSearch.group(1)

def getTotalFrames(file, fps):
    information = Popen(("ffmpeg", "-i", file), stdout=PIPE, stderr=PIPE)
    timecode = search("(\d+):(\d+):(\d+).(\d+)", information.communicate()[1])
    return ((((float(timecode.group(1)) * 60) + float(timecode.group(2))) * 60) + float(timecode.group(3)) + float(timecode.group(4))/100) * float(fps)

def clusterRun(file, fileName, durationPerJob, fps):
    start = 0.0
    end = durationPerJob
    runCount = 0
    jobList = []
    #submits equal conversion portions to each host
    for i in hostList:
        runCount += 1
        runFfmpeg = "ssh %s 'cd %s;ffmpeg -ss %f -t %f -y -i %s %s </dev/null'" % (i, encodeDir, start, end, file, fileName + "_run" + str(runCount) + ".flv")
        start += end + 1/float(fps)
        jobList.append(Popen(runFfmpeg, shell=True))
    #wait for all jobs to complete
    runCount = 0
    for i in hostList:
        jobList[runCount].wait()
        runCount += 1
    #append/rebuild final from parts and rebuild index
    avidemuxHead = "avidemux2_cli --autoindex --load %s_run1.flv --append %s_run2.flv " % (fileName, fileName)
    avidemuxTail = "--audio-codec copy --video-codec copy --save %sFinal.flv" % (fileName)
    #add --appends for additional host above the first 2
    for i in range(len(hostList) - 2):
        avidemuxHead = "%s --append %s_run%d.flv " % (avidemuxHead, fileName, i + 3)
    runAvidemux = "%s %s" % (avidemuxHead, avidemuxTail)
    Popen(runAvidemux, shell=True)

#Main begin
sourceFile = sys.argv[1]
fps = getFps(sourceFile)
totalFrames = getTotalFrames(sourceFile, fps)
durationPerJob = getDurationPerJob(totalFrames, fps)
fileName = os.path.splitext(sourceFile)[0]

clusterRun(sourceFile, fileName, durationPerJob, fps)


November 20, 2010
Spammers brought down the E-classifieds.net script

My first foray into the world of web programming was back in 2002, a time when websites didn’t need to verify you weren’t a bot. Accounts could be created and forms submitted without captcha verification. Obviously times have changed, and tonight the e-classifieds script I’ve maintained for 8 years got suspended with the following message from the ISP:

Domain has exceeded the max emails per hour (200) allowed. Message discarded. User xxx has been suspended for the following reason:
Spamming

Some spammer was using a vulnerability to auto-submit the forms. I hadn’t touched the code in years, but to fix the problem I was going to have to roll up my sleeves and stuff a captcha into the important forms. First stop: which captcha to use? Sure, I could use reCAPTCHA, but that would require installing a CPAN Perl module on a shared account, which I didn’t want to deal with, so I searched Google and ended up with http://bumblebeeware.com/captcha/, a simple Perl script that worked great. If you’re running e-classifieds and are in the same predicament, give that script a shot.



November 16, 2010
VLC error – cannot open esound socket from ALSA lib

Turns out VLC won’t run as root, so I xhost +, su to an unprivileged user, and manage to get the video going, but there’s no sound. Instead the terminal is flooded with the errors below, and searching the errors for 30 minutes on Google doesn’t return anything conclusive. Then I think, hmmmm, the sound device is probably privileged. Log out of the root Gnome session, back in as an unprivileged user, and presto, it worked. I wonder what the xhost + equivalent for the audio card is?

ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2184:(snd_pcm_open_noupdate) Unknown PCM default
[00000492] oss audio output error: cannot open audio device (/dev/dsp)
ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2184:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2184:(snd_pcm_open_noupdate) Unknown PCM default
[00000492] esd audio output error: cannot open esound socket (format 0x00001021 at 44100 Hz)
[00000492] arts audio output error: arts_init failed (can't connect to aRts soundserver)
ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2184:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2184:(snd_pcm_open_noupdate) Unknown PCM default


June 23, 2010
Reducing Disk IO by data reduction via MySQL aggregate functions

On one of my past hobby projects I had thousands of data points being generated. As I was only interested in the most recent value for each instance, I had a nightly cron job reduce the data, but my database got too large and kept triggering the IO rate alarm. By using the MAX aggregate function in the query and walking the result with mysql_fetch_assoc, I deleted instances that hadn’t been active in 30 days:

$query = mysql_query("SELECT Id, MAX(last_action) FROM Group GROUP BY Id");
$count = mysql_num_rows($query);
$goBack30Days = 60*60*24*30;
if ($count > 0) {
    while ($row = mysql_fetch_assoc($query)) {
        $lastContact = $row['MAX(last_action)'];
        $Id = $row["Id"];
        if ($lastContact < (time() - $goBack30Days)) {
            mysql_query("DELETE FROM Group WHERE userId='$Id' ");
        }
    }
    mysql_free_result($query);
}


June 11, 2010
libQtGui.so: undefined reference to `FcFreeTypeQueryFace’

When building in QT I got the following error:

/software/qt/qt/lib/libQtGui.so: undefined reference to `FcFreeTypeQueryFace’
collect2: ld returned 1 exit status

The solution can be found here but that wasn’t enough. While compiling fontconfig-2.4.2 I had missing dependencies:

checking for LIBXML2… configure: error: Package requirements (libxml-2.0 >= 2.6) were not met: No package 'libxml-2.0' found

Installing libxml2-devel (not libxml2) got me moving again.



June 10, 2010
Qt Error: “No valid Qt version set. Set one in Tools/Options”

I installed the Qt SDK and got this error:

No valid Qt version set. Set one in Tools/Options
Error while building project basiclayouts
When executing build step ‘QMake’
Canceled build.

It wasn’t clear why the qmake binary hadn’t been auto-detected and where to find it:

image

Even though I launched QT Creator from qt/bin, and there were a couple files and directories called qmake here and there, I had to manually point it to a different bin dir qt/qt/bin/qmake as so:

image



February 26, 2010
Building against Static Qt for portability

I ported my tip calculator from Android to Qt on Centos and tried to run it on a RHEL4 host and got the following:

>./tipCalc.Dynamic

Floating exception

tipCalc was only 40kB and ldd revealed it was trying to load shared libraries I didn’t have on that host:

>ldd tipCalc.Dynamic
libQtGui.so.4 => /software/qt/qt/lib/libQtGui.so.4 (0x006d6000)
libQtCore.so.4 => /software/qt/qt/lib/libQtCore.so.4 (0x00110000)

To build a static app required 2 steps:

1.  Rebuild Qt libraries statically:

>configure -static -prefix /software/qtStatic/ -make libs -make tools -release -nomake examples -nomake demos

>gmake

The demos and examples take a long time to compile (and disk space; I ran out of it twice in VMware), so I left them out with -nomake.

2. Add the following options to the project .pro file:

CONFIG += staticlib
CONFIG += release

Now tipCalc.Static is 12MB but doesn’t require shared Qt libraries.



December 10, 2009
The checksum & palindrome algorithm

For my tip calculator’s “restaurant checksum,” the amount is bumped up until the sum of the dollar digits equals the cents. Here’s my checksum code:

int nAddUpLeftDecimals(int nX)
{
    int remain, sum = 0;
    nX = nX / 100;
    while (nX >= 1)
    {
        remain = nX % 10;
        sum = sum + remain;
        nX = nX / 10;
    }
    return sum;
}

int Checksum(int nX) {
    while (nAddUpLeftDecimals(nX) != nX % 100) {
        nX += 1;
    }
    return nX;
}

and my palindrome code:

boolean nCheckPalindrome(int nX) {
    int yy, xx, zz = 0;
    yy = nX;
    while (nX > 0)
    {
        xx = nX % 10;
        nX /= 10;
        zz = zz * 10 + xx;
    }
    if (zz == yy)
        return true;
    else
        return false;
}

int Palindrome(int nX) {
    while (!nCheckPalindrome(nX)) {
        nX += 1;
    }
    return nX;
}



December 1, 2009
Debug.startMethodTracing stopped unexpectedly

I kept getting this error when trying to use Traceview.  Turns out I had to enable SD card writing in the manifest:

<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />



November 23, 2009
Profiling with Traceview using the Android SDK

I noticed my seekbars weren’t responsive or fluid, so I used the profiler to check what was wrong. I swiped the seekbar 3 times, and you can clearly see that SQLite was the problem:

image

I improved the code, and the only remaining spike was the initialization of the SQL database, which can be seen in both traces:

image