URL Looping Problem

Hi,

I am very new to Python, but I have lots of experience with Stata and other statistical software. I am trying to create a database that pulls information from the International Trade Commission (ITC) website. Essentially, I want every detail they have on every patent infringement case the ITC has ever handled. I have written a VERY SIMPLE (at least I think it is) Python script that gets all the data I want from one page. The problem is that each case has its own page.

This link http://info.usitc.gov/ouii/public/337in ... l?OpenView shows all the cases the ITC has had (over 770 of them). Clicking on each specific case brings up a URL like this one... http://info.usitc.gov/ouii/public/337in ... enDocument.

The code I wrote uses this one URL, but I want it to loop over all the URLs and create a separate observation for each one. The only difference between the URLs is a series of letters and numbers near the end, just before ?OpenDocument. I have made those bold on two separate case links below for illustration:

http://info.usitc.gov/ouii/public/337inv.nsf/56ff5fbca63b069e852565460078c0ae/[b]2f6aea0c070b20e3852578930070cf1f[/b]?OpenDocument

http://info.usitc.gov/ouii/public/337inv.nsf/56ff5fbca63b069e852565460078c0ae/[b]4878841e49b2ec2d8525770a00704997[/b]?OpenDocument
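
For illustration, the URL pattern above could be turned into a loop along these lines. This is only a rough sketch: the function names (extract_case_ids, build_case_url) and the sample_html string are made up for the example, and it assumes the links on the index page end in the same 32-character ID followed by ?OpenDocument as the two case links shown here. In a real run, sample_html would be replaced by the downloaded HTML of the index page.

```python
import re

# Fixed parts of every case URL, taken from the two example links above.
BASE = "http://info.usitc.gov/ouii/public/337inv.nsf"
VIEW_ID = "56ff5fbca63b069e852565460078c0ae"

# Assumption: each case link ends in a 32-character hex ID followed by ?OpenDocument.
DOC_LINK = re.compile(r"/([0-9a-fA-F]{32})\?OpenDocument")


def extract_case_ids(index_html):
    """Return the unique case IDs found in the index page HTML, in page order."""
    seen = []
    for doc_id in DOC_LINK.findall(index_html):
        doc_id = doc_id.lower()
        if doc_id not in seen:
            seen.append(doc_id)
    return seen


def build_case_url(doc_id):
    """Rebuild the full URL of one case page from its ID."""
    return f"{BASE}/{VIEW_ID}/{doc_id}?OpenDocument"


# Hypothetical stand-in for the index page HTML (not the real page).
sample_html = """
<a href="/ouii/public/337inv.nsf/56ff5fbca63b069e852565460078c0ae/2f6aea0c070b20e3852578930070cf1f?OpenDocument">Case A</a>
<a href="/ouii/public/337inv.nsf/56ff5fbca63b069e852565460078c0ae/4878841e49b2ec2d8525770a00704997?OpenDocument">Case B</a>
"""

case_urls = [build_case_url(doc_id) for doc_id in extract_case_ids(sample_html)]

for url in case_urls:
    # The existing single-page scraping code would go here,
    # producing one observation (row) per case URL.
    print(url)
```

The idea is to collect the IDs once from the index page and then run the existing single-page code inside the for loop, so each case becomes its own row.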

OK, so if anyone can help me I would REALLY REALLY appreciate it.

I have attached my script.

FYI: I made this script by copying most of it from someone who scraped data from a basketball stats website and just replacing terms, which is why you see words like "stats" in it. I could not tell you what any of it means because I am so new to this.

THANKS SO MUCH IN ADVANCE!
