Topic: Web crawler C#
Questionable
« on: October 30, 2012, 12:18:00 PM »

Hi,

I'm on placement and will soon be writing a web crawler to test the company's site for vulnerabilities. I will start with SQL injection but will eventually cover the OWASP Top 10. I will be creating this either in Visual C# or in MVC. It will be a long project, so I'd just like some advice on where I should start.

I know that I will need to use things like regular expressions, and to parse data and search it through pattern matching. I will be looking at http://arachnode.net for ideas on how the crawling process works, but if anyone has done something similar or has any links or articles to share, that would be great.
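As a rough illustration of the pattern-matching idea, here is a minimal sketch that scans a response body for database error messages, which is one common hint that a parameter may be injectable. It is written in Python for brevity; the same patterns should carry over to .NET's `Regex` class. The name `find_sql_errors` and the short pattern list are assumptions for illustration only, and a real scanner would need a much larger, tuned set.

```python
import re

# Illustrative, non-exhaustive fragments of well-known database error messages.
# Assumption: these three are only examples; they are not a complete signature set.
SQL_ERROR_PATTERNS = [
    re.compile(r"you have an error in your sql syntax", re.IGNORECASE),  # MySQL
    re.compile(r"unclosed quotation mark after the character string", re.IGNORECASE),  # SQL Server
    re.compile(r"ORA-\d{5}"),  # Oracle error codes such as ORA-00933
]


def find_sql_errors(body):
    """Return the source text of every pattern that matches the response body."""
    return [p.pattern for p in SQL_ERROR_PATTERNS if p.search(body)]
```

A match only suggests that the page leaks database errors; it does not prove an injection flaw, so any hit would still need manual confirmation on a site you are authorised to test.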

Thanks,
We can re-code him, we have the technology!
ajohnson
aka dynamik


« Reply #1 on: October 30, 2012, 02:27:10 PM »

I haven't done anything like this on the Windows side, but as a general word of advice, spend a good deal of time up-front researching built-in and third-party libraries. I've done things like this with other languages and spent more time than I would have liked toiling over some nitty-gritty details that could have been handled by an existing library in just a few lines of code.

For example, searching for "C# html parser" led me to this: http://htmlagilitypack.codeplex.com/
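To show why a parser library saves effort compared with hand-written regular expressions, here is a minimal sketch of link extraction. It uses Python's standard `html.parser` module as a stand-in and is not the Html Agility Pack API; `LinkExtractor` and `extract_links` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of every <a> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse an HTML string and return its links in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

The parser copes with mixed-case tags and unquoted attribute values, which are exactly the details that tend to break a quick regex.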

I'd break down all the complex tasks you're planning on performing into a series of basic tasks and spend time researching how others have tackled similar problems. Do whatever you can to avoid reinventing the wheel.
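One possible way to split a crawler into basic tasks is sketched below in Python: fetching a page and extracting its links are passed in as separate functions, and the crawl loop only manages the queue of URLs. The names `crawl`, `fetch`, `extract_links`, and `max_pages` are assumptions made for this illustration, not part of any particular library.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, fetch, extract_links, max_pages=100):
    """Breadth-first crawl restricted to the host of start_url.

    fetch(url) returns the page body, or None if the page is unavailable.
    extract_links(body, url) returns the absolute URLs found in the body.
    Returns the list of URLs fetched successfully, in visit order.
    """
    allowed_host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}  # every URL ever queued, so nothing is fetched twice
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        body = fetch(url)
        if body is None:
            continue
        visited.append(url)
        for link in extract_links(body, url):
            # Stay on the target site and skip URLs that are already queued.
            if urlparse(link).netloc == allowed_host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Keeping the pieces separate means each one can be researched, replaced with an existing library, and tested on its own, for example by passing in fake `fetch` and `extract_links` functions that read from a dictionary instead of the network.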

Also, consider using IronPython to integrate Python into your project. Python has many excellent libraries for these types of projects, and it may be easiest to go this route instead of trying to recreate a library that doesn't have a C# equivalent.

Sorry I couldn't offer more specific advice. Good luck, and let us know how the project shapes up.

WIP: GCFA | www.infosiege.net | @infosiege

The day you stop learning is the day you start becoming obsolete.
Questionable
« Reply #2 on: October 30, 2012, 05:11:23 PM »

Oh no, this has been quite helpful. I have already taken a look at http://htmlagilitypack.codeplex.com/. I will be making a similar project at home, since the code I write at work is regarded as the company's. I will update this thread once I have started to get into the project.

I will take a look at IronPython. I did think I would have to either port code from Python or use Python libraries inside .NET.
« Last Edit: October 30, 2012, 05:14:36 PM by Questionable »
