Monthly Archives: June 2008

Python Performance Part 2 Redux: Split & Reduce Large Strings for 'A Href' Hypertext

split_2.py

    def get_value(a):
        return a[1:a.find(">")-1]

    hrefs = map(get_value, open("hypertext.html", "r").read().split("<a href="))

Timing Comparison: roughly 3.2x faster (1.263s vs. 0.392s real time)

Note: hypertext.html is 48MB.

    braydon@bgf:~/python_tests/extract$ time python split.py
    real 0m1.263s
    user 0m1.112s
    sys 0m0.156s

    braydon@bgf:~/python_tests/extract$ time python split_2.py
    real 0m0.392s
    user 0m0.268s
    sys 0m0.120s

split.py … Continue reading
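For readers on Python 3, here is a minimal sketch of the same split-and-slice idea. It assumes every link is written exactly as `<a href="...">`, with double quotes and no other attributes after `href`. The name `extract_hrefs` and the sample string are made up for illustration. Unlike the one-liner in split_2.py, this version skips the first chunk, since that is the text before the first link rather than a link.

```python
def get_value(chunk):
    # Each chunk starts right after '<a href=', e.g. '"/tree">Tree</a> ...'.
    # Slice from just past the opening quote to just before the closing quote.
    end = chunk.find(">")
    return chunk[1:end - 1]


def extract_hrefs(html):
    # Hypothetical helper name, not from the original post.
    chunks = html.split("<a href=")
    # chunks[0] is whatever precedes the first link, so it is skipped.
    return [get_value(chunk) for chunk in chunks[1:]]


sample = '<p><a href="http://world.org">World</a> and <a href="/tree">Tree</a></p>'
print(extract_hrefs(sample))
```

The speed comes from `str.split` doing a single pass over the string in C. The trade-off is that any markup that deviates from the assumed form (single quotes, extra attributes before `href`, uppercase tags) will produce wrong values.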

Posted in Code, Hacking

Python Performance Part 2: Parsing Large Strings for 'A Href' Hypertext

Goal: Write a fast Python script that takes a large HTML string and reduces it to a list of all the hyperlinks in it, such as ["http://world.org", "/tree"].

Attempt 1: Self-Recursion

    f = open('hypertext_sm.html', 'r')
    ahrefs = []
    count … Continue reading
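As a point of reference for the goal above, the following sketch collects hrefs with the standard library's `html.parser` module (Python 3). This is not one of the attempts from the post; it is a swapped-in baseline that is likely slower than string slicing but copes with single quotes and extra attributes. The names `HrefCollector` and `collect_hrefs` are hypothetical.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def collect_hrefs(html):
    # Hypothetical helper name, not from the original post.
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


print(collect_hrefs('<a href="http://world.org">World</a> <a class="nav" href=\'/tree\'>Tree</a>'))
```

A parser-based version like this is useful mainly as a correctness check when comparing faster, more fragile approaches.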

Posted in Code, Hacking

Python Performance Part 1: Transforming Large Lists into Separate Smaller Lists

Goal: Write a fast Python script that takes a large list and breaks it into smaller sub-lists of a set size, such as transforming [a,b,c,d,e,f] into [[a,b],[c,d],[e,f]].

Attempt 1: Map/Reduce (0.93s)

    #import a list of 247,213 integers … Continue reading
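To make the goal concrete, here is a minimal sketch that chunks a list with slicing (Python 3). It is not the Map/Reduce attempt from the post, and no timing is claimed for it. The function name `chunk` is an assumption for illustration.

```python
def chunk(items, size):
    # Hypothetical helper name, not from the original post.
    if size < 1:
        raise ValueError("size must be at least 1")
    # Step through the list in strides of `size`, slicing out each sub-list.
    # The final sub-list is shorter when len(items) is not a multiple of size.
    return [items[i:i + size] for i in range(0, len(items), size)]


print(chunk(["a", "b", "c", "d", "e", "f"], 2))
```

Each slice copies only its own elements, so the total work grows linearly with the length of the list.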

Posted in Code, Hacking