Jose Nazario

a 30-something technologist who used to be a biochemist. now i travel the world helping people secure their networks. all from ann arbor, mi.

here you'll find my blog, projects, pictures, my wiki, and much more. what you won't find is a strong set of design skills.

scraping VZB network latency tables in F#

yesterday for work (and reddit's /r/dailyprogrammer) i wrote a small HTML table scraper in F#. i could have used some of the HTML tools (like an HTML provider, or the Html PowerPack) but i chose to do it with only regexes. data pipelines are now super easy. the pieces are below, and the example program will spit out some latencies. you can easily compose new ones.

to get all latency tables, here's a start. transforming it to something else is an exercise left to the reader.

html |> tables |> List.filter (fun x -> x.IndexOf( "Latency") > 0) |> (fun x -> rows x |> List.filter (fun x -> x.IndexOf( " (fun x -> cells x |> (fun x -> stripHtml x) ))
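for reference, here's a minimal sketch of how the pieces in that pipeline could fit together using only regexes. the bodies of `tables`, `rows`, `cells` and `stripHtml` are assumptions for illustration, not the original definitions, and `matchesOf` and `latencyRows` are hypothetical names. it also assumes the tables aren't nested, since a lazy regex will stop at the first closing tag.

```fsharp
open System.Text.RegularExpressions

// hypothetical helper: every substring of `input` matching `pattern`,
// case-insensitive, with '.' allowed to span newlines
let matchesOf (pattern: string) (input: string) : string list =
    Regex.Matches(input, pattern, RegexOptions.IgnoreCase ||| RegexOptions.Singleline)
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Value)
    |> List.ofSeq

// assumed shapes for the pieces named in the pipeline above
let tables (html: string) = matchesOf @"<table\b.*?</table>" html
let rows (table: string) = matchesOf @"<tr\b.*?</tr>" table
let cells (row: string) = matchesOf @"<t[dh]\b.*?</t[dh]>" row
let stripHtml (s: string) = Regex.Replace(s, @"<[^>]+>", "").Trim()

// keep only tables that mention "Latency", then turn each row
// into a list of plain-text cell values
let latencyRows (html: string) : string list list =
    html
    |> tables
    |> List.filter (fun t -> t.IndexOf("Latency") >= 0)
    |> List.collect rows
    |> List.map (fun r -> cells r |> List.map stripHtml)
    |> List.filter (fun r -> not (List.isEmpty r))
```

each stage takes a string and hands back a list of strings, which is why composing a new pipeline is mostly a matter of swapping the filter or the final map.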


scraping HTML tables in #fsharp to get internet latencies from VZB's network stats page

— jnazario (@jnazario) December 16, 2014

Tuesday, Dec 16, 2014 @ 10:34am


