Ranklist Generator
WARNING: Technical stuff ahead. Proceed at your own risk.
Here is the post I promised about the rank list generator I built. It is a tiny program that extracts the CGPA and SGPA of every student, saves them in a file, then sorts them by SGPA and by CGPA and saves the two sorted lists in separate files. I used Python 2.7, with the “requests” library for sending and receiving data and the BeautifulSoup library to parse the web pages received. The hardest part of the entire project was sending the data, so that is all I’ll deal with here. For everything else, which was very easy, here’s the code.
The main page (http://117.211.91.61/web/Default.aspx) has a text input field. On reading the page source, we find 9 input tags, of which 5 are hidden. The names of the important input elements (hidden and visible) are ToolkitScriptManager1_HiddenField, __EVENTTARGET, __EVENTARGUMENT, __VIEWSTATE, txtRegno, btnimgShow and btnimgShowResult, along with a <select> element named ddlSemester. We can extract the value of each of these variables with an HTML parser. Once we have the values, we can create a POST request (since the <form> tag has “method = POST”).
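To make the extraction step concrete, here is a minimal sketch of pulling the field values out of a page. It is not the code from the project: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages, it is written for Python 3, and SAMPLE_PAGE is a made-up stand-in for the real page source (the class name FormFieldParser is hypothetical too).

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect name -> value for each <input> and the selected <option> of each <select>."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._select_name = None  # name of the <select> we are currently inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            # Inputs without a value attribute are stored as an empty string.
            self.fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "select":
            self._select_name = attrs.get("name")
        elif tag == "option" and self._select_name and "selected" in attrs:
            self.fields[self._select_name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "select":
            self._select_name = None


# Made-up stand-in for the real page source, trimmed to a few fields.
SAMPLE_PAGE = """
<form method="post" action="Default.aspx">
  <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5" />
  <input type="text" name="txtRegno" />
  <select name="ddlSemester">
    <option value="4">4</option>
    <option selected="selected" value="5">5</option>
  </select>
</form>
"""

parser = FormFieldParser()
parser.feed(SAMPLE_PAGE)
cur_state = parser.fields["__VIEWSTATE"]  # the value we must echo back in the POST
print(parser.fields)
```

With BeautifulSoup the same lookup is usually a one-line find on the parsed page; the point here is only that every field we need is sitting in the HTML we already received.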
But before we do so, we need to see the format of the POST request. So we navigate to the page, open the Network tab of the developer tools (in Google Chrome or Firefox), enter the registration number in the text field and press the “Show” button. We are taken to another page (with the same address). The Network tab shows a list of the objects loaded. We right-click the request and copy the request headers and also the entire request as cURL. Now we can totally construct the POST request.
From the copied content,
headers = {
    'Host': '117.211.91.61',
    'Connection': 'keep-alive',
    'Content-Length': '377',
    'Cache-Control': 'max-age=0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Origin': 'http://117.211.91.61',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Referer': 'http://117.211.91.61/web/Default.aspx',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.8,hi;q=0.6'
}
Looking at the end of the cURL command, we can find the values of the variables being passed.
data_raita = {
    'ToolkitScriptManager1_HiddenField': '',
    '__EVENTTARGET': '',
    '__EVENTARGUMENT': '',
    '__VIEWSTATE': cur_state,
    'txtRegno': 'EL112132',
    'btnimgShow.x': '34',
    'btnimgShow.y': '10'
}
where cur_state = whatever_viewstate_we_extracted_from_the_page
Important: Note that we click the “Show” button on the side, not the “Show Result” button at the bottom. Hence the use of “btnimgShow”.
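The “.x” and “.y” keys deserve a word: an image button submits the click coordinates as name.x and name.y instead of a single value, which is why “btnimgShow” shows up twice. Below is a small sketch of what this dictionary looks like once it is form-encoded, which is the encoding requests applies when a dict is passed as data=. The cur_state value is a placeholder, not a real view state, and the snippet is written for Python 3.

```python
from urllib.parse import parse_qs, urlencode

cur_state = "dDwtMTIzNDU2Nzg5"  # placeholder, not a real __VIEWSTATE

data_raita = {
    "ToolkitScriptManager1_HiddenField": "",
    "__EVENTTARGET": "",
    "__EVENTARGUMENT": "",
    "__VIEWSTATE": cur_state,
    "txtRegno": "EL112132",
    "btnimgShow.x": "34",  # click coordinates of the "Show" image button
    "btnimgShow.y": "10",
}

# This is the application/x-www-form-urlencoded body that goes over the wire.
body = urlencode(data_raita)
print(body)

# Decoding it again shows that empty fields are still sent, just with no value.
decoded = parse_qs(body, keep_blank_values=True)
print(decoded["txtRegno"], decoded["__EVENTTARGET"])
```

The exact coordinates most likely do not matter to the server; I simply reuse the ones the browser sent.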
Upon passing these values, we are taken to another page, now with different values for all the hidden variables. So, now we have a new set of data to be passed (again, format retrieved from the Network tab),
data_soup = {
    'ToolkitScriptManager1_HiddenField': '',
    '__EVENTTARGET': '',
    '__EVENTARGUMENT': '',
    '__VIEWSTATE': next_state,
    'txtRegno': 'EL112132',
    'ddlSemester': '5',
    'btnimgShowResult.x': '25',
    'btnimgShowResult.y': '11'
}
where next_state = whatever_viewstate_we_extracted_from_the_following_page
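Putting the two requests together, the whole exchange is GET, POST (“Show”), POST (“Show Result”), with the view state carried forward each time. The sketch below is one way to arrange that, not the actual project code. The function names are hypothetical, the regular expression assumes the value attribute comes after name on the hidden field (an HTML parser is more robust), and session is assumed to be something like requests.Session(), i.e. an object whose get and post return a response with a .text attribute.

```python
import re

URL = "http://117.211.91.61/web/Default.aspx"

# Assumption: value="..." follows name="__VIEWSTATE" inside the same tag.
VIEWSTATE_RE = re.compile(r'name="__VIEWSTATE"[^>]*\bvalue="([^"]*)"')


def extract_viewstate(html):
    """Return the __VIEWSTATE value embedded in a page, or raise if it is missing."""
    match = VIEWSTATE_RE.search(html)
    if match is None:
        raise ValueError("no __VIEWSTATE field found in page")
    return match.group(1)


def fetch_result_page(session, regno, semester):
    """Run GET -> POST ("Show") -> POST ("Show Result") and return the result HTML."""
    common = {
        "ToolkitScriptManager1_HiddenField": "",
        "__EVENTTARGET": "",
        "__EVENTARGUMENT": "",
        "txtRegno": regno,
    }

    # Step 0: load the form to obtain the initial view state.
    cur_state = extract_viewstate(session.get(URL).text)

    # Step 1: press "Show"; the response carries a new view state.
    data_raita = dict(common, __VIEWSTATE=cur_state)
    data_raita.update({"btnimgShow.x": "34", "btnimgShow.y": "10"})
    next_state = extract_viewstate(session.post(URL, data=data_raita).text)

    # Step 2: press "Show Result" with the semester selected.
    data_soup = dict(common, __VIEWSTATE=next_state, ddlSemester=str(semester))
    data_soup.update({"btnimgShowResult.x": "25", "btnimgShowResult.y": "11"})
    return session.post(URL, data=data_soup).text
```

A call would look like fetch_result_page(requests.Session(), "EL112132", 5). The copied headers are left out here to keep the sketch short; they could be passed along with each request if the server turns out to need them.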
Thus we get the actual result upon passing this data_soup data. Now we can extract the values of the “lblCPI” and “lblSPI” spans and save them to a file along with the “lblRollno”.
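For completeness, here is a rough sketch of that last step together with the sorting. It rests on several assumptions: that “lblCPI”, “lblSPI” and “lblRollno” are the id attributes of the spans, that CPI corresponds to the CGPA and SPI to the SGPA, and that the values parse as plain decimals. The sample pages and roll numbers are invented, the standard library parser stands in for BeautifulSoup, and the code is written for Python 3.

```python
from html.parser import HTMLParser


class LabelParser(HTMLParser):
    """Collect the text of <span> elements whose id is one of `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.values = {}
        self._current = None  # id of the wanted span we are currently inside

    def handle_starttag(self, tag, attrs):
        span_id = dict(attrs).get("id")
        if tag == "span" and span_id in self.wanted:
            self._current = span_id
            self.values[span_id] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.values[self._current] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def parse_result(html):
    """Turn one result page into a record with roll number, CGPA and SGPA."""
    parser = LabelParser(["lblRollno", "lblCPI", "lblSPI"])
    parser.feed(html)
    return {
        "roll": parser.values["lblRollno"].strip(),
        "cgpa": float(parser.values["lblCPI"]),  # assumption: CPI is the CGPA
        "sgpa": float(parser.values["lblSPI"]),  # assumption: SPI is the SGPA
    }


# Invented result pages, reduced to the three spans we care about.
SAMPLE_RESULTS = [
    '<span id="lblRollno">EL000001</span><span id="lblCPI">8.10</span><span id="lblSPI">8.90</span>',
    '<span id="lblRollno">EL000002</span><span id="lblCPI">9.00</span><span id="lblSPI">8.40</span>',
    '<span id="lblRollno">EL000003</span><span id="lblCPI">7.50</span><span id="lblSPI">9.20</span>',
]

records = [parse_result(page) for page in SAMPLE_RESULTS]

# Two rank lists: one ordered by SGPA, one by CGPA, highest first.
by_sgpa = sorted(records, key=lambda r: r["sgpa"], reverse=True)
by_cgpa = sorted(records, key=lambda r: r["cgpa"], reverse=True)

for rank, rec in enumerate(by_sgpa, start=1):
    print(rank, rec["roll"], rec["sgpa"])
```

Each sorted list can then be written to its own file, one line per student.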
That is pretty much the essence of this tiny piece of code. I’ll update the post with a link to the code as soon as I push it online.
And of course, a happy new year!!! :-)