[Course Selection Script] Using a Python Web Crawler to Select (Grab) Courses (Updated to v1.0.8)

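0x00 Preface

Every course selection period feels like going to war.

Everyone has courses they want, but there are only so many seats.

So everyone shows off their own tricks: some use js, some use the chrome console.

Life is short, I use Python.

(Last Update: 2020/09/22, version v1.0.8)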

0x01 Dependencies

Python 3.x

A browser, if you want to look at the saved html results

beautifulsoup4>=4.6.0

bs4>=0.0.1

configparser>=3.5.0

lxml>=3.7.3

requests>=2.13.0

tqdm>=4.11.2
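
Before running the script it can help to check which of these packages are actually importable. The following is only a minimal sketch and not part of CDSelector: the helper name `missing_modules` is made up for illustration, and the list uses import names rather than package names (beautifulsoup4 is imported as `bs4`).

```python
import importlib.util

# Import names for the packages listed above (beautifulsoup4 is imported as bs4).
REQUIRED_MODULES = ["bs4", "configparser", "lxml", "requests", "tqdm"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

Anything reported as missing can then be installed with pip.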

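0x02 Usage

Getting the program

You can git clone the latest version of the program directly:

$ git clone github/okcd00/CDSelector.git
$ cd CDSelector
$ vim config      # edit the login information
$ vim courseid    # edit the course selection
$ python CDSelector.py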

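Edit the config file

[info]
username = [your SEP login account name, usually an email address]
password = [your password, no double quotes needed]
runtime = [interval in seconds between enrollment attempts]

[action]
debug = false [debug mode prints a lot of intermediate variables; set it to false to save resources]
enroll = true [in polling mode, keep retrying in an infinite loop; I cannot think of a case where this should be false]
evaluate = true [verify whether the enrollment succeeded; recommended]
select_bat = false [batch selection, for special cases such as English B, where the courses cannot be selected individually and the form can only be submitted when both "Listening & Speaking" and "Reading & Writing" are selected together]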
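
To make the meaning of these fields concrete, here is a rough sketch of how such a file could be read and how `runtime` and `enroll` might drive the polling. It is not the script's actual code: `load_settings`, `poll` and the `try_enroll` callback are hypothetical names, the sample values are placeholders, and treating `enroll = false` as "try once only" is my assumption. Note that the bracketed explanations in the template above have to be replaced by real values; something like `debug = false [..]` would not parse as a boolean.

```python
import time
from configparser import RawConfigParser

# Sample content in the layout described above; all values are placeholders.
SAMPLE_CONFIG = """
[info]
username = someone@example.com
password = not-a-real-password
runtime = 5

[action]
debug = false
enroll = true
evaluate = true
select_bat = false
"""


def load_settings(text):
    """Parse config text into a plain dict with typed values."""
    cf = RawConfigParser()
    cf.read_string(text)
    return {
        "username": cf.get("info", "username"),
        "password": cf.get("info", "password"),
        "runtime": cf.getint("info", "runtime"),
        "debug": cf.getboolean("action", "debug"),
        "enroll": cf.getboolean("action", "enroll"),
        "evaluate": cf.getboolean("action", "evaluate"),
        "select_bat": cf.getboolean("action", "select_bat"),
    }


def poll(try_enroll, settings, sleep=time.sleep, max_attempts=None):
    """Call `try_enroll` until it returns True and return the attempt count.

    Waits `runtime` seconds between attempts. Returns None if `enroll` is
    false after a failed first attempt (assumption) or if `max_attempts`
    is exhausted.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        if try_enroll():
            return attempts
        if not settings["enroll"]:
            break
        sleep(settings["runtime"])
    return None


settings = load_settings(SAMPLE_CONFIG)
outcomes = iter([False, False, True])  # stand-in: the third attempt succeeds
attempts = poll(lambda: next(outcomes), settings, sleep=lambda seconds: None)
print("runtime:", settings["runtime"], "attempts:", attempts)
```

The `sleep` parameter is only there so the loop can be tried out without actually waiting.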

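Edit the courseid file

One course per line, written as in the examples below: each line is the department plus the course code.

With this update of the course selection system (that is, v1.0.8), course codes are decoupled from department codes, so for now the department name has to be added by hand in courseid. The word "学院" (school) may be omitted, but the first two characters must be right.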

计算机:081203M04003H

公管学院:030100M01004H

In particular, if you need a course to count as a degree course, append on after it, also separated by a colon.

计算机学院:081203M04003H:on

公管:030100M01004H:on
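
The line format above can be sketched in a few lines of Python. This is a simplified illustration rather than the script's own parser: `parse_course_line` and `SAMPLE_DEPTS` are names I made up, and the table holds only two entries of the department dictionary.

```python
# A small subset of the department table (ID -> first two characters of the name).
SAMPLE_DEPTS = {"951": "计算", "945": "公管"}
NAME_TO_ID = {name: dept_id for dept_id, name in SAMPLE_DEPTS.items()}


def parse_course_line(line):
    """Split 'department:course_code[:on]' into its parts.

    Returns (department_id, course_code, is_degree); department_id is None
    when the first two characters of the department name are unknown.
    """
    fields = line.strip().replace(" ", "").split(":")
    if len(fields) < 2:
        raise ValueError("expected 'department:course_code', got %r" % line)
    dept_id = NAME_TO_ID.get(fields[0][:2])
    is_degree = len(fields) == 3 and fields[2] == "on"
    return dept_id, fields[1], is_degree


for sample in ["计算机:081203M04003H", "公管学院:030100M01004H:on"]:
    print(parse_course_line(sample))
```

Only the first two characters of the department name take part in the lookup, which is why "计算机" and "计算机学院" are interchangeable.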

Then run CDSelector.py

python filename.py is the way Python code is run. If, after installing python, you find that double-clicking my CDSelector.py runs it directly, that has the same effect.

$ python CDSelector.py

Debug Mode: True

Enrolling start

> Course Selection is unreachable or not started. <1134> Thu Jun 01 08:43:42 2017

If you see ImportError: xxx, some python packages are missing. Install them with the command below; pip is a tool that ships with python, so there is nothing extra to download.

$ pip install xxx

Of course, if you are a bit more familiar with python, you can also install all the dependencies in one go

$ pip install -

0x03 Source Code

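The code has grown quite a bit, and pasting all of it here would hurt readability, so it has been moved to the end of this post. The latest full code is in the repository listed under 0xFE.

v1.0.0 The web access part follows the implementation by scusjs; the feature enhancements follow zoecur

v1.0.7 Thanks to bobo334334 for providing error samples, and to xzqforever for providing an account for testing

(Updated: 2017/09/07) Minor adjustments for changed course selection system parameters, which had kept courses of some departments from being selected

v1.0.8 Thanks to daiiwei for providing an account for testing

(Updated: 2020/09/22) The SEP course selection system changed a lot this time, a major overhaul. Version v1.0.8

Updated the department ID dictionary: the IDs changed from the first two digits of the course code to irregular 3-digit integers

"403 Forbidden" occurs more often; several headers were added to prevent 403

"Session expired, please log in again" occurs more often; a Cookie mode is now used to keep the login state

Improved the log output, and key pages are now saved as offline pages. When the course selection system is flooded with traffic, this gives a lightweight way to view pages locally: with the js/css preset in the repository, only the page source has to be loaded, which allows a relatively lightweight visual check.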
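
The last three changes can be sketched roughly as follows. This is an illustration under assumptions, not the script's code: `build_headers` and `localize_static_paths` are hypothetical helpers, the cookie names are placeholders, and picking the User-Agent at random is my own choice here (the script's listing picks a fixed entry from its list).

```python
import random

# Two User-Agent strings of the kind the script keeps in its list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0",
]


def build_headers(cookies, user_agents=USER_AGENTS, rng=random):
    """Build browser-like request headers, carrying the session cookies along."""
    headers = {
        "User-Agent": rng.choice(user_agents),
        "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
        "Upgrade-Insecure-Requests": "1",
    }
    if cookies:
        # Send the current cookies explicitly so the login state is kept.
        headers["Cookie"] = "; ".join(
            "{}={}".format(k, v) for k, v in cookies.items())
    return headers


def localize_static_paths(html):
    """Make /static references relative so a saved page can use local js/css."""
    return (html.replace('href="/static', 'href="static')
                .replace('src="/static', 'src="static'))


print(build_headers({"sepuser": "abc", "JSESSIONID": "xyz"})["Cookie"])
print(localize_static_paths(
    '<link href="/static/a.css"><script src="/static/b.js"></script>'))
```

A page saved this way next to a local static folder can be opened in a browser without hitting the server again.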

0xFE Where to get it

Github: github/okcd00/CDSelector

Release: github/okcd00/CDSelector/releases

Documentation: blog.csdn/okcd00/article/details/72827861

Acknowledgements:

Mailto: zoecur@icloud

Mailto: scusjs@foxmail

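0xFF Single-file code listing (404 lines)

Some readers still prefer to get everything they need on one page and are not keen on jumping over to Github (which has not been fast to access lately either).

And for those who liked my earlier single-file style, I will paste it here after all.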

# coding=utf8
# =====================================================
# filename : CDSelector.py
# author : okcd00 / okcd00@qq
# date : 2020-09-22
# desc : UCAS Course_Selection Program
# =====================================================
import re
import os
import sys
import time
import requests
from bs4 import BeautifulSoup
from configparser import RawConfigParser

# Department ID -> first two characters of the department name
index_course = {
    '910': u'数学', '911': u'物理', '957': u'天文', '912': u'化学', '928': u'材料',
    '913': u'生命', '914': u'地球', '921': u'资环', '951': u'计算', '952': u'电子',
    '958': u'工程', '917': u'经管', '945': u'公管', '927': u'人文', '964': u'马克',
    '915': u'外语', '954': u'中丹', '955': u'国际', '959': u'存济', '946': u'体育',
    '961': u'微电', '962': u'未来', '963': u'网络', '968': u'心理', '969': u'人工',
    '970': u'纳米', '971': u'艺术', '972': u'光电', '967': u'创新', '973': u'核学',
    '974': u'现代', '975': u'化学', '976': u'海洋', '977': u'航空', '979': u'杭州'
}

# Reverse mapping: two-character department name -> department ID
dept_ids_dict = dict([(v, k) for k, v in index_course.items()])

# Pool of User-Agent strings
header_store = [
    "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/537.75.14",
    "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Win64; x64; Trident/6.0)",
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11',
    'Opera/9.25 (Windows NT 5.1; U; en)',
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
    'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Kubuntu)',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.12) Gecko/20070731 Ubuntu/dapper-security Firefox/1.5.0.12',
    'Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/1.2.9',
    "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.7 (KHTML, like Gecko) Ubuntu/11.04 Chromium/16.0.912.77 Chrome/16.0.912.77 Safari/535.7",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
    "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0 ",
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
]

class UCASEvaluate:
    def __init__(self):
        self.__read_from_course_id('./courseid')

        # Load the login information and switches from the config file
        cf = RawConfigParser()
        cf.read('config')
        self.username = cf.get('info', 'username')
        self.password = cf.get('info', 'password')
        self.runtime = cf.getint('info', 'runtime')
        self.debug = cf.getboolean('action', 'debug')
        self.evaluate = cf.getboolean('action', 'evaluate')
        self.select_bat = cf.getboolean('action', 'select_bat')
        self.watch_logo = cf.getboolean('action', 'watch_logo')

        self.loginPage = 'sep.ucas.ac'
        self.loginUrl = self.loginPage + '/slogin'
        self.selectCourseUrl = 'jwjz.ucas.ac/Student/DesktopModules/Course/SelectCourse.aspx'

        # Browser-like headers, to reduce "403 Forbidden" responses
        self.headers = {
            'Host': 'jwxk.ucas.ac',
            'Connection': 'keep-alive',
            # 'Pragma': 'no-cache',
            # 'Cache-Control': 'no-cache',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': header_store[-5],
            'Accept-Encoding': 'gzip, deflate',
            'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,zh-TW;q=0.7',
        }
        # self.headers = None

        # One session for all requests, so cookies persist between them
        self.s = requests.Session()
        self.s.get(self.loginPage, headers=self.headers)

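    def dump_check(self, response, page_name='check'):
        # In debug mode, save the page locally. Static paths are made relative
        # so that the js/css shipped in the repository can be used offline.
        if self.debug:
            with open('./{}.html'.format(page_name), 'wb+') as f:
                text = response.text.replace('href="/static', 'href="static')
                text = text.replace('src="/static', 'src="static')
                f.write(text.encode('utf-8'))

    def dump_here(self, response):
        self.dump_check(response, 'here')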

    @staticmethod
    def show_http_request(url, data):
        # Render the request as url?key=value&... for the debug log
        request_str = '{}'.format(url)
        if data is not None:
            request_str += '?'
            request_str += '&'.join(['{}={}'.format(k, v) for k, v in data.items()])
        return request_str

    def show_response(self, response, url="", data=None, description=""):
        if 200 <= int(response.status_code) < 300:
            status_str = "Link Success"
        else:
            status_str = "Link failed with code {}".format(int(response.status_code))
        print('[{}] {}'.format(description, status_str))

        if self.debug:
            print("\tReq as {}".format(self.show_http_request(url, data)))
            print("\tView as {}".format(response.url))
            print("\tCookie: {}".format(self.s.cookies.get_dict()))

    def update_headers_with_cookie(self):
        # Copy the session cookies into the Cookie header to keep the login state
        self.headers.update({'Cookie': ';'.join(['{}={}'.format(k, v) for k, v in self.s.cookies.items()])})

    def session_get(self, url, data=None, desc=""):
        response = self.s.get(
            url=url, data=data, headers=self.headers)
        self.show_response(
            response, url, data, description=desc)
        self.update_headers_with_cookie()
        return response

    def session_post(self, url, data=None, desc=""):
        response = self.s.post(
            url=url, data=data, headers=self.headers)
        self.show_response(
            response, url, data, description=desc)
        self.update_headers_with_cookie()
        return response

    def login(self):
        post_data = {
            'userName': self.username,
            'pwd': self.password,
            'sb': 'sb'
        }
        response = self.s.post(
            self.loginUrl, data=post_data, headers=self.headers)
        self.show_response(response, self.loginUrl, post_data, 'Login')
        # The login counts as successful once the 'sepuser' cookie is set
        if 'sepuser' in self.s.cookies.get_dict():
            return True
        return False

    @staticmethod
    def get_message(restext):
        # Extract the message box text from a result page
        css_soup = BeautifulSoup(restext, 'html.parser')
        text = css_soup.select('#main-content > div > div.m-cbox.m-lgray > div.mc-body > div')[0].text
        return "".join(line.strip() for line in text.split('\n'))

    def __read_from_course_id(self, filename):
        print('[Loading CourseID]')
        with open(filename, 'rb') as courses_file:
            for line in courses_file.readlines():
                if isinstance(line, bytes):
                    line = line.decode('utf-8')
                # Each line is department:course_code[:on]
                line = line.strip().replace(' ', '').split(':')
                # Only the first two characters of the department name are used
                course_dept = dept_ids_dict.get(line[0][:2])
                print(line[1], line[0][:2], 'ID:', course_dept)
                course_id = line[1]
                is_degree = False
                if len(line) == 3 and line[2] == 'on':
                    is_degree = True
        print("")

    def enrollCourses(self):
        response = self.session_get(
            self.courseSystem, desc='SEP AppStore')
        soup = BeautifulSoup(response.text, 'html.parser')
        # Pull the Identity token out of the link to the course selection site
        identity = re.findall(r'"jwxk.ucas.ac/login\?Identity=(.*)&roleId=[0-9]{2,4}"',
                              str(soup))[0]
        print("[Obtain Identity]", identity)
        try:
