October 30, 2012 at 5:18 pm #7990 Questionable Participant
I’m on placement and will soon be attempting to write a web crawler that will be used to test the company’s site for vulnerabilities. I will start with SQL injection, but it will eventually cover the OWASP Top 10. I will be creating this either in Visual C# or in MVC. It will be a long project, so I’d just like some advice on where I should start.
I know that I will need to use things like regular expressions and to search the parsed data through pattern matching. I will be looking at http://arachnode.net for some ideas on how the crawling process works, but if anyone has done something similar or has any links or articles they could point me to, that would be great.
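To make the pattern-matching idea concrete, here is a minimal sketch in Python of checking a response body for database error text with regular expressions, which is one common first signal when probing for SQL injection. The name `find_sql_error_signatures` and the pattern table are hypothetical, and the error fragments are a small illustrative sample rather than a complete list.

```python
import re

# Illustrative, non-exhaustive error fragments (an assumption for this sketch);
# tune these for the databases the site under test actually uses.
SQL_ERROR_PATTERNS = {
    "MySQL": re.compile(r"You have an error in your SQL syntax", re.IGNORECASE),
    "SQL Server": re.compile(r"Unclosed quotation mark after the character string", re.IGNORECASE),
    "Oracle": re.compile(r"ORA-\d{5}"),
    "PostgreSQL": re.compile(r"syntax error at or near", re.IGNORECASE),
}


def find_sql_error_signatures(body):
    """Return the names of database engines whose error text appears in body."""
    return [name for name, pattern in SQL_ERROR_PATTERNS.items() if pattern.search(body)]
```

A match only suggests that unsanitised input reached the database; a page with no match can still be injectable, so this is a starting heuristic rather than a verdict.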
October 30, 2012 at 7:27 pm #50722 dynamik Participant
I haven’t done anything like this on the Windows side, but as a general word of advice, spend a good deal of time up-front researching built-in and third-party libraries. I’ve done things like this with other languages and spent more time than I would have liked toiling over some nitty-gritty details that could have been handled by an existing library in just a few lines of code.
For example, searching for C# html parser led me to this: http://htmlagilitypack.codeplex.com/
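As a rough illustration of what a parser library saves you, here is a sketch of link extraction in Python using the standard library's `html.parser` (not the Html Agility Pack, whose API differs). The names `LinkCollector` and `extract_links` are hypothetical and only cover `<a href>` links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they were found on.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Parse html and return every link found, resolved against base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

The parser deals with mixed-case tags, attribute quoting, and relative URLs, which are exactly the details that tend to eat time when handled with hand-written regular expressions.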
I’d break down all the complex tasks you’re planning on performing into a series of basic tasks and spend time researching how others have tackled similar problems. Do whatever you can to avoid reinventing the wheel.
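One possible way to break the crawler down, sketched in Python under stated assumptions: the crawl loop only manages the queue of URLs, while fetching and link extraction are passed in as separate pieces. The function `crawl` and its parameters `fetch`, `extract_links`, and `max_pages` are hypothetical names for this sketch, not part of any existing library.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, fetch, extract_links, max_pages=50):
    """Breadth-first crawl restricted to the host of start_url.

    fetch(url) returns the page body as a string, or None on failure.
    extract_links(url, body) returns the absolute URLs found on that page.
    Both are callables supplied by the caller.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        body = fetch(url)
        if body is None:
            continue
        visited.append(url)
        for link in extract_links(url, body):
            # Stay on the site being tested and never queue a URL twice.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Keeping the pieces separate means each one can be researched, swapped for a library, and tested on its own, for example by passing in a fake `fetch` that reads from a dictionary instead of the network.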
Also, consider using IronPython to integrate Python into your project. Python has many excellent libraries for these types of projects, and it may be easiest to go this route instead of trying to recreate a library that doesn’t have a C# equivalent.
Sorry I couldn’t offer more specific advice. Good luck, and let us know how the project shapes up.
October 30, 2012 at 10:11 pm #50723 Questionable Participant
Oh no, this has been quite helpful. I have already taken a look at http://htmlagilitypack.codeplex.com/. I will be making a similar project at home, since the code I write at work is regarded as the company’s. I will update this post once I have started to get into the project.
I will take a look at IronPython. I did think I would have to either port code from Python or use Python libraries inside .NET.