Web Data Extraction, Spiders, Scrapers..

#1
Old 12-05-2008, 10:15 AM
Registered User
 
Join Date: Dec 2008
Posts: 1
Web Data Extraction, Spiders, Scrapers..

Hey!

I'm new here, so a small introduction: I'm from Toronto, Canada, and I run a small finance-focused website.

The problem: many financial institutions publish their data online and update it daily. There are over 60 institutions, and following each one by hand is very challenging. I want to create a summary page with financial data from those institutions: run a spider once a day, collect their updates, and then post them all together on the website.

Obviously copy and paste is off the table, since it takes at least 1.5 hours to go through all the lenders and get their data. The only workable solution seems to be a custom spider that crawls specific fields (div tags, table cells), extracts the data, and compiles it into one file. The question is: do you know of any software capable of doing this? I know there are plenty of scrapers out there, but the spider must be able to extract data from specified table cells and, in some cases, div tags.
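To make the requirement concrete, here is a minimal sketch of the kind of extraction meant above, written in Python with only the standard library. It assumes the target table cells and divs carry id attributes; the ids, the sample markup, and the names FieldExtractor and extract_fields are all made up for illustration and are not taken from any lender's site.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect the text inside elements whose id attribute is in wanted_ids."""

    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.fields = {}      # id -> extracted text
        self._current = None  # id of the element being captured, if any
        self._tag = None      # tag name of that element
        self._depth = 0       # nesting depth of same-named tags inside it
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            # Already capturing: only track nesting of the same tag name.
            if tag == self._tag:
                self._depth += 1
            return
        element_id = dict(attrs).get("id")
        if element_id in self.wanted_ids:
            self._current, self._tag, self._depth = element_id, tag, 1
            self._chunks = []

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and collapse runs of whitespace.
                text = " ".join("".join(self._chunks).split())
                self.fields[self._current] = text
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def extract_fields(html, wanted_ids):
    parser = FieldExtractor(wanted_ids)
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical lender page; real pages will use different markup.
SAMPLE = """
<table>
  <tr><td>5-year fixed</td><td id="rate-5yr">5.49%</td></tr>
</table>
<div id="updated">Updated <b>Dec 5</b>, 2008</div>
"""

fields = extract_fields(SAMPLE, ["rate-5yr", "updated"])
print(fields)
```

Pages that do not label their cells with ids would need a different rule (for example, "the second cell of the third row"), so each institution will probably need its own small set of selectors.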

I can't go to a data extraction company, since they charge too much (do they?). Please let me know if you're aware of any applications that can meet those requirements.

Any help is appreciated, guys. Thanks!

#2
Old 12-06-2008, 08:40 AM
curtiss
Moderator
 
Join Date: May 2003
Posts: 1,533
If you know specifically what information you need to extract from the pages, you can try using PHP's file_get_contents. That will only work for remote URLs if allow_url_fopen is enabled in the PHP configuration of the server running your script (it is a setting on your side, not on the banks' servers). If it does work, you can simply write a script that loops over all of the pages from which you want to extract information and pulls the relevant values out of each one.
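A minimal sketch of that loop, written in Python rather than PHP, with urllib.request.urlopen standing in for file_get_contents. The helper names (fetch, collect, to_csv), the example.com-style URLs, and the page contents are assumptions for illustration only. The demo at the bottom swaps in a fake fetcher, so it runs without touching the network.

```python
import csv
import io
import urllib.request


def fetch(url, timeout=10):
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def collect(sources, extract, fetch_page=fetch):
    """Fetch every page in sources and return one row per institution.

    sources: mapping of institution name -> URL
    extract: function turning a page's HTML into the value to record
    A failed download is recorded as an error instead of aborting the run.
    """
    rows = []
    for name, url in sources.items():
        try:
            rows.append((name, extract(fetch_page(url)), ""))
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            rows.append((name, "", str(exc)))
    return rows


def to_csv(rows):
    """Compile the collected rows into a single CSV document."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["institution", "value", "error"])
    writer.writerows(rows)
    return out.getvalue()


# Offline demo with made-up pages, so nothing here touches the network.
FAKE_PAGES = {
    "https://bank-a.example/rates": "<td>5.49%</td>",
    "https://bank-b.example/rates": "<td>5.25%</td>",
}


def fake_fetch(url):
    return FAKE_PAGES[url]


def cell_text(html):
    # Placeholder extraction; a real script would parse the HTML properly.
    return html.replace("<td>", "").replace("</td>", "")


sources = {
    "Bank A": "https://bank-a.example/rates",
    "Bank B": "https://bank-b.example/rates",
}
rows = collect(sources, cell_text, fetch_page=fake_fetch)
print(to_csv(rows))
```

For a real run you would drop the fetch_page argument so collect uses fetch, then schedule the script once a day (cron or similar). Before pointing it at 60 institutions, check each site's terms of use and robots.txt.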
All times are GMT -5.
© 1997-2009 HTMLCenter