Author: bigtrace
Update 2020/12/05: I built an online tool that parses BSK or converts BSK back into plaintext: http://www.baidubsk.site
For personal study only.
A long while ago I wrote an automatic Baidu posting bot, and flooding threads too fast got the account permanently banned. Recently I reworked that old code into a Baidu Tieba client that runs from the command line in Python.
Here is a quick walkthrough of how to use it and what it can do.
Browsing the front page
When the program starts, it automatically reads your Baidu account cookie file. A successful login prints islogin=1, a failed one prints islogin=0.
You can then type help to see the common commands.
From there you have the following options:
Browse the forums you follow: type mf
Browse the threads you have started: type mt (this scrapes the 我的贴子 page)
Browse your recent replies: type mr (this scrapes the 我的回帖 page)
Browse replies other people made to you: type rm (this scrapes the 回复我的 page)
Entering a forum
Type a, then the forum name when prompted. For example, type a first, then 斗鱼TV.
By default this shows the threads on the forum's front page. If you want to browse older threads, type s followed by a page number, e.g. s 100 to view page 100. These page numbers correspond to the pager at the bottom of the forum's web page.
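For reference, page N on that pager maps to the pn URL parameter as pn = (N - 1) * 50, since each page lists 50 threads; this is what the s command computes internally. A minimal sketch (build_forum_url is just an illustrative name; the real code does this inside shouye()):

import urllib

def build_forum_url(forum_name, page):
    kw = urllib.quote(forum_name.encode('utf-8'))   # URL-encode the forum name as UTF-8
    pn = (page - 1) * 50                            # 50 threads per page
    return 'http://tieba.baidu.com/f?kw=%s&ie=utf-8&pn=%d' % (kw, pn)

print build_forum_url(u'斗鱼TV', 100)   # the URL the program fetches for "s 100"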
In the listing, the number after index is what you use to open a thread, the number in parentheses is that thread's current reply count, and the rightmost column is the thread starter's ID.
Follow the current forum: type like
Unfollow it: type dislike
Sign in (签到): type si
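Behind si there is only a small POST to Tieba's sign-in endpoint, as done by onekeySignin() in the code below. A rough sketch, where post_form is a stand-in for the pycurl POST the program actually performs, and kw/tbs are the forum name and anti-CSRF token scraped from the page you are browsing:

import json
import urllib

def sign_in(kw, tbs, post_form):
    # post_form(url, data) is a stand-in for the pycurl POST used in onekeySignin()
    data = urllib.urlencode({'ie': 'utf-8', 'kw': kw.encode('utf-8'), 'tbs': tbs})
    response = json.loads(post_form('https://tieba.baidu.com/sign/add', data))
    return response['error'] == ''   # an empty error string means the sign-in succeeded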
Reading a thread
By thread index
Every thread has an index; if one interests you, type t index. For example, if thread 20 looks interesting, type t 20.
By thread URL
If you already know the thread's URL, enter it in the form t url.
The program fetches every reply in the thread and annotates each floor with the poster's ID and in-forum level, the client and time of the post, and the number of sub-replies under that floor.
If a floor contains a video, its download link is printed; if it contains images, their URLs are shown.
For very long threads, say several hundred pages of replies, the program walks through every page one by one.
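The page walk itself is straightforward: read the pager at the bottom of page 1, take the last pn value as the page count, then fetch ?pn=1 through ?pn=last in order, which is what go_into_each_post() does. A sketch, with fetch_html standing in for the pycurl download:

import re
import lxml.html

def iter_reply_pages(thread_url, fetch_html):
    # fetch_html(url) is a stand-in for the pycurl download used in the program
    doc = lxml.html.fromstring(fetch_html(thread_url + '?pn=1'))
    pager = doc.xpath("//li[@class='l_pager pager_theme_5 pb_list_pager']/a/@href")
    last = int(re.search(r"pn=(\d+)", pager[-1]).group(1)) if pager else 1
    for pn in range(1, last + 1):
        yield pn, fetch_html(thread_url + '?pn=%d' % pn)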
Show only the thread starter's posts
Type zklz.
Expanding sub-replies (楼中楼)
When you want to see the sub-replies under a specific floor, use e.g. lzl 25 and the program lists every sub-reply inside floor 25.
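The sub-replies come from Tieba's comment endpoint, the same URL lzl_more() builds in the code: tid is the thread id and pid the post id of that floor. A minimal sketch (fetch_html again stands in for the pycurl download):

import lxml.html

def fetch_lzl_page(tid, pid, page, fetch_html):
    # fetch_html(url) is a stand-in for the pycurl download used in lzl_more()
    url = 'https://tieba.baidu.com/p/comment?tid=%s&pid=%s&pn=%d' % (tid, pid, page)
    doc = lxml.html.fromstring(fetch_html(url))
    for item in doc.xpath('//li[contains(@class, "lzl_single_post")]'):
        user = item.xpath('.//a[contains(@class, "at j_user_card ")]')[0].text
        text = item.xpath('.//span[contains(@class, "lzl_content_main")]')[0].text_content()
        yield user, text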
One thing to note: Baidu recently added a nickname feature, but the program always shows a user's original ID, so it may differ from what the web version displays.
To browse a different forum, type a again and then the name of the forum you want.
Making a new post
Type p and, following the prompts, enter the thread's title and content. The program reports whether the post succeeded.
Replying
A reply function is, of course, indispensable.
Direct reply
Type r to reply to the thread you viewed most recently. A small "tail" is automatically appended to whatever you write; its content is fully customizable. I like to put an image and some random text in mine, which earns a bit more experience. You can also pick a signature by changing 'sign_id':sign_id in the program's POST form.
The program first asks whether you want to insert an image; paste an image URL if you do, or just press Enter to skip. Local images can be uploaded to an image host first and then inserted by URL. GIFs have some restrictions, e.g. width under 530 and size under 3 MB.
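Inserted images end up as a BBCode-style [img] tag whose width and height are read with PIL, which is what Get_size_of_url_img() does. A sketch, assuming download_to_file is a stand-in for the pycurl fetch:

from PIL import Image

def make_img_tag(url_img, download_to_file, tmp_path='img_upload'):
    # download_to_file(url, path) is a stand-in for the pycurl fetch in Get_size_of_url_img()
    download_to_file(url_img, tmp_path)
    width, height = Image.open(tmp_path).size
    # per the note above, GIFs need width < 530 and file size < 3 MB to go through
    return '[img pic_type=1 width=%d height=%d]%s[/img]' % (width, height, url_img)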
Replying inside a floor (楼中楼)
If you want to reply to a specific floor rather than to the thread starter, type r floor_num. For example, to reply to floor 2, type r 2, where 2 is the floor number. Note that sub-replies do not carry the tail or images.
Any amount of whitespace between r and the floor number is fine.
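That tolerance comes from the command parser, which pulls the floor number out with a regex instead of splitting on a single space (the same pattern appears in the main loop below):

import re

for cmd in ('r 2', 'r     2'):
    floor = re.search(r"r\s+(\d+)", cmd).group(1)   # same pattern as the main loop
    print 'reply to floor ' + floor                 # both commands resolve to floor 2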
Other common commands
f - refresh the front-page thread list
ft - refresh the current thread's replies; for example, right after replying with r, type ft to refresh the thread and see your own reply immediately
b - jump back to the front-page thread list
c - clear the screen
e - exit
Logging in
I use a Chrome extension dedicated to importing and exporting cookies. After logging in to your Baidu account, open any Tieba thread page, export the cookies with the extension (JSON format), and save them to cookie.txt.
The program reads that file to log in. Tieba's internal verification keeps getting more complex and I never figured out how to log in directly with a username, password and captcha, so using a cookie file is admittedly a shortcut, but it is good enough for me. If you know how to do a proper login, please share.
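For reference, the whole login boils down to concatenating the exported cookies into a name=value;domain=...; string, handing it to pycurl, and asking the tbs endpoint whether the session is valid; this mirrors Load_cookie() below (minus the proxy options):

import json
import pycurl
from urllib import quote_plus
from StringIO import StringIO

# build a cookie header string from the browser-exported cookie.txt (JSON format)
with open('cookie.txt') as f:
    cookie_str = ''.join('%s=%s;domain=%s;' % (quote_plus(c['name']),
                                               quote_plus(c['value']),
                                               c['domain'])
                         for c in json.load(f))

buf = StringIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, 'http://tieba.baidu.com/dc/common/tbs')
c.setopt(pycurl.COOKIE, cookie_str)
c.setopt(c.WRITEDATA, buf)
c.perform()
c.close()
print "islogin=" + str(json.loads(buf.getvalue())['is_login'])   # 1 means logged in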
The code is attached below. There is nothing complicated in it, so just leave a comment if you run into problems.
I use the pycurl library mainly to get through my company's NTLM proxy authentication. If you are not behind a corporate proxy, you can comment out the following lines:
self.c.setopt(pycurl.PROXY, 'http://192.168.87.15:8080')
self.c.setopt(pycurl.PROXYUSERPWD, 'LL66269:123456789')
self.c.setopt(pycurl.PROXYAUTH, pycurl.HTTPAUTH_NTLM)
Working out the newly added BSK parameter (updated 8/29/2017)
I found an excellent repo on GitHub:
https://link.zhihu.com/?target=https%3A//github.com/8qwe24657913/Analyze_baidu_BSK
It already gives a fairly detailed solution; all I can do here is admire it. What I will cover is how to run its deobf.js JavaScript from Python.
I first tried quite a few Python libraries for executing JS, such as js2py, selenium and PyV8, but unfortunately they all errored out. I am not entirely sure why; some statements that run fine in the browser console fail once moved into Python.
So I went with the most reliable option: using selenium to drive Firefox's webdriver and evaluate the script there.
The computed BSK is then carried in the POST form, and posting succeeds.
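The pattern is simply: start a Firefox webdriver, append a return statement that calls the solver with the current tbs, and let execute_script hand the value back to Python, as solve_bsk() does below. Roughly (bsk_solver_js stands for the solver script; the full text is embedded in solve_bsk()):

from selenium import webdriver

def compute_bsk(tbs, bsk_solver_js):
    # bsk_solver_js defines function bsk_solver(tbs_str); the full script lives in solve_bsk()
    driver = webdriver.Firefox()   # requires geckodriver.exe on the PATH
    try:
        # execute_script only returns a value if the script itself ends with "return ..."
        return driver.execute_script(bsk_solver_js + "\nreturn bsk_solver('%s');" % tbs)
    finally:
        driver.quit()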
Picking your favorite colors in the console
Adding color makes the little program much easier on the eyes. The colorama and termcolor libraries make it trivial to change the display color of any string. For example, colored(u"楼主", 'red') prints 楼主 in red.
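A minimal sketch of the coloring (init() is only needed so ANSI colors render in the Windows console):

from colorama import init
from termcolor import colored

init()   # enable ANSI color handling on the Windows console
# encode for a Chinese Windows console, the same convention the program uses
print (colored(u"楼主", 'red') + u' <10> ' + colored('bigtrace', 'green')).encode("gb18030")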
New: image preview
Type pic to open a small PyQt window that previews every image in the current thread; the console also prints the floor the current image belongs to and how many images that floor contains.
In the code this is implemented by a class called Pic_Viewer.
The window has four buttons: NF jumps to the next floor that contains images, PF jumps back to the previous such floor, N shows the next image within the current floor, and P shows the previous one.
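Pic_Viewer takes a dict that maps floor numbers (as strings) to lists of image URLs, which is exactly what the program collects into current_thread_img_list while reading a thread. Launching it standalone looks roughly like this (the URLs are placeholders, and the Pic_Viewer class itself comes from the source below):

import sys
from PyQt4 import QtGui

# placeholder data: floor number (string) -> list of image URLs found on that floor
pics = {'3':  ['http://example.com/a.jpg'],
        '12': ['http://example.com/b.jpg', 'http://example.com/c.png']}

viewer_app = QtGui.QApplication(sys.argv)
viewer = Pic_Viewer(pics)          # Pic_Viewer is defined in the source code below
sys.exit(viewer_app.exec_())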
Source code below (Python 2.7)
Third-party dependencies:
pycurl, lxml, PIL, colorama, termcolor, selenium, PyQt4
The main program follows. Replying and posting additionally require your Baidu account cookie file (JSON format) and Firefox's webdriver (geckodriver.exe).
# coding=utf-8
import time
import pycurl
import os.path
import sys, locale
import random
from random import randint
import urllib
from urllib import urlencode, quote_plus
from StringIO import StringIO
import json
from pprint import pprint
import re
import lxml.html
import codecs
from HTMLParser import HTMLParser
import unicodedata
from PIL import Image
from colorama import Fore, Back, Style,init
from termcolor import colored
from selenium import webdriver
from PyQt4.QtGui import *
from PyQt4 import QtGui
# class definition
class Pic_Viewer(QtGui.QWidget):
def __init__(self,all_pic_list):
super(Pic_Viewer, self).__init__()
#self.url_list=['http://static.cnbetacdn.com/article/2017/0831/8eb7de909625140.png','http://static.cnbetacdn.com/article/2017/0831/7f11d5ec94fa123.png','http://static.cnbetacdn.com/article/2017/0831/1b6595175fb5486.jpg']
self.url_dict=all_pic_list
self.url_floor_num=self.url_dict.keys()
self.url_floor_num.sort(key=lambda x: int(x),reverse=False) # sort the key list by floor number
self.current_pic_floor=0
self.current_pic_index=0
self.initUI()
#time.sleep(5)
def initUI(self):
QtGui.QToolTip.setFont(QtGui.QFont('Test', 10))
self.setToolTip('This is a <b>QWidget</b> widget')
# Show image
self.pic = QtGui.QLabel(self)
self.pic.setGeometry(0, 0, 600, 500)
#self.pic.setPixmap(QtGui.QPixmap("/home/lpp/Desktop/image1.png"))
pixmap = QPixmap()
data=self.retrieve_from_url(self.url_dict[self.url_floor_num[0]][0])
pixmap.loadFromData(data)
self.pic.setPixmap(pixmap)
#self.pic.setPixmap(QtGui.QPixmap.loadFromData(data))
# Show button
btn_next_same_floor = QtGui.QPushButton('N', self)
btn_next_same_floor.setToolTip('This is a <b>QPushButton</b> widget')
btn_next_same_floor.resize(btn_next_same_floor.sizeHint())
btn_next_same_floor.clicked.connect(self.fun_next_pic_same_floor)
btn_next_same_floor.move(400, 0)
btn_prev_same_floor = QtGui.QPushButton('P', self)
btn_prev_same_floor.setToolTip('This is a <b>QPushButton</b> widget')
btn_prev_same_floor.resize(btn_prev_same_floor.sizeHint())
btn_prev_same_floor.clicked.connect(self.fun_prev_pic_same_floor)
btn_prev_same_floor.move(100, 0)
btn_next_floor = QtGui.QPushButton('NF', self)
btn_next_floor.setToolTip('This is a <b>QPushButton</b> widget')
btn_next_floor.resize(btn_next_floor.sizeHint())
btn_next_floor.clicked.connect(self.fun_next_floor)
btn_next_floor.move(500, 0)
btn_prev_floor = QtGui.QPushButton('PF', self)
btn_prev_floor.setToolTip('This is a <b>QPushButton</b> widget')
btn_prev_floor.resize(btn_prev_floor.sizeHint())
btn_prev_floor.clicked.connect(self.fun_prev_floor)
btn_prev_floor.move(0, 0)
self.setGeometry(300, 300, 600, 500)
self.setWindowTitle('ImgViewer')
self.show()
self.print_current_location()
def retrieve_from_url(self,pic_url):
c = pycurl.Curl()
c.setopt(pycurl.PROXY, 'http://192.168.87.15:8080')
c.setopt(pycurl.PROXYUSERPWD, 'LL66269:')
c.setopt(pycurl.PROXYAUTH, pycurl.HTTPAUTH_NTLM)
buffer = StringIO()
c.setopt(pycurl.URL, pic_url)
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()
data = buffer.getvalue()
return data
def print_current_location(self):
sys.stdout.write('\r')
sys.stdout.write("[ %sL ] %s (%d)" % (self.url_floor_num[self.current_pic_floor], str(self.current_pic_index+1),len(self.url_dict[self.url_floor_num[self.current_pic_floor]])))
sys.stdout.flush()
# Connect button to image updating
def fun_next_pic_same_floor(self):
if len(self.url_dict[self.url_floor_num[self.current_pic_floor]])>1:
if self.current_pic_index < len(self.url_dict[self.url_floor_num[self.current_pic_floor]])-1:
self.current_pic_index=self.current_pic_index+1
else:
self.current_pic_index=0
pixmap = QPixmap()
data=self.retrieve_from_url(self.url_dict[self.url_floor_num[self.current_pic_floor]][self.current_pic_index])
pixmap.loadFromData(data)
self.pic.setPixmap(pixmap)
#self.pic.setPixmap(QtGui.QPixmap( "/home/lpp/Desktop/image2.png"))
self.print_current_location()
def fun_prev_pic_same_floor(self):
if len(self.url_dict[self.url_floor_num[self.current_pic_floor]])>1:
if self.current_pic_index > 0:
self.current_pic_index=self.current_pic_index-1
else:
self.current_pic_index=len(self.url_dict[self.url_floor_num[self.current_pic_floor]])-1
pixmap = QPixmap()
data=self.retrieve_from_url(self.url_dict[self.url_floor_num[self.current_pic_floor]][self.current_pic_index])
pixmap.loadFromData(data)
self.pic.setPixmap(pixmap)
self.print_current_location()
def fun_next_floor(self):
if self.current_pic_floor < len(self.url_floor_num)-1:
self.current_pic_floor=self.current_pic_floor+1
else:
self.current_pic_floor=0
self.current_pic_index=0
pixmap = QPixmap()
data=self.retrieve_from_url(self.url_dict[self.url_floor_num[self.current_pic_floor]][self.current_pic_index])
pixmap.loadFromData(data)
self.pic.setPixmap(pixmap)
self.print_current_location()
def fun_prev_floor(self):
if self.current_pic_floor > 0:
self.current_pic_floor=self.current_pic_floor-1
else:
self.current_pic_floor=len(self.url_floor_num)-1
self.current_pic_index=0
pixmap = QPixmap()
data=self.retrieve_from_url(self.url_dict[self.url_floor_num[self.current_pic_floor]][self.current_pic_index])
pixmap.loadFromData(data)
self.pic.setPixmap(pixmap)
self.print_current_location()
#---------------------------------------------
# class definition
class Browser_tieba:
mouse_pwd_fix="27,17,15,26,21,19,16,42,18,15,19,15,18,15,19,15,18,15,19,15,18,15,19,15,18,15,19,42,17,18,27,22,18,42,18,26,17,19,15,18,19,27,19,"
zklz=False
pid_floor_map={}
tiebaName_utf=""
tiebaName_url=""
tiezi_link=[]
shouye_index=1
shouye_titles=[]
last_viewed_tiezi_index=0
current_view_tiezi_link=""
tail=""
#u"[br][br][br]---来自百度贴吧Python客户端[br][br]"
#[url]http://www.jianshu.com/p/11b085d326c2[/url]
#[emotion pic_type=1 width=30 height=30]//tb2.bdstatic.com/tb/editor/images/face/i_f25.png?t=20140803[/emotion]
c = pycurl.Curl()
def __init__(self):
self.Load_cookie()
#self.read_source()
Welcome=u"""
_ _ _ _ _______ _ _
| || || | | | _ (_______)(_) | |
| || || | ____ | | ____ ___ ____ ____ | |_ ___ _ _ ____ | | _ ____
| ||_|| | / _ )| | / ___) / _ \ | \ / _ ) | _) / _ \ | | | | / _ )| || \ / _ |
| |___| |( (/ / | |( (___ | |_| || | | |( (/ / | |__ | |_| | | |_____ | |( (/ / | |_) )( ( | |
\______| \____)|_| \____) \___/ |_|_|_| \____) \___) \___/ \______)|_| \____)|____/ \_||_|
_ _ _
_ | | | |(_) _
___ ___ ____ _ _ | |_ | | _ ___ ____ ____ | | _ ____ ____ | |_
(___)(___) | _ \ | | | || _) | || \ / _ \ | _ \ / ___)| || | / _ )| _ \ | _)
| | | || |_| || |__ | | | || |_| || | | | ( (___ | || |( (/ / | | | || |__
| ||_/ \__ | \___)|_| |_| \___/ |_| |_| \____)|_||_| \____)|_| |_| \___)
|_| (____/
简书: 用python写一个百度贴吧客户端
http://www.jianshu.com/p/11b085d326c2
by bigtrace
"""
print Welcome
def solve_bsk(self,tbs_str):
driver = webdriver.Firefox()
bsk_js_1="""
function bsk_solver(tbs_str) {
var MAP = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/<$+-%>{:* \\,}[^=_~&](")';
var IN=tbs_str; // this is tbs
var OUT={};
function encodeStr(str) {
var res = [];
for (var i = 0; i < str.length; i++) {
res.push(3 + (5 ^ str.charCodeAt(i)) ^ 6)
}
return res
}
function decodeCharCode(code) {
return (6 ^ code) - 3 ^ 5
}
function toCharCodeArr(str) {
var res = [];
for (var i = 0; i < str.length; i++) {
res.push(str.charCodeAt(i))
}
return res
}
function decodeChar(code) {
return String.fromCharCode(decodeCharCode(code))
}
function decodeStr(arr) {
return map(flatten(arr), decodeChar).join('')
}
function fromCharCodes(charCodes) {
return String.fromCharCode.apply(null, charCodes)
}
function map(arr, func) {
var res = [];
for (var i = 0; i < arr.length; i++) {
res.push(func(arr[i], i))
}
return res
}
function isArr(wtf) {
return wtf.push && 0 === wtf.length || wtf.length
}
function flatten(arr) {
return isArr(arr) ? [].concat.apply([], map(arr, flatten)) : arr
}
function genRes(arr, map) {
for (var i = 0; i < arr.length; i++) {
arr[i] = decodeCharCode(arr[i]);
arr[i] = arr[i] ^ map[i % map.length]
}
return arr
}
function nextFunc(funcs) {
var index = Math.floor(Math.random() * funcs.length);
return funcs.splice(index, 1)[0]
}
function startRun() {
var isNodejs = false;
try {
isNodejs = Boolean(global.process || global.Buffer)
} catch (n) {
isNodejs = false
}
if (isNodejs) {
var wtf = decodeStr(toCharCodeArr(MAP)); // bug: quote isn't escaped
func = function () {
var [key, func] = nextFunc(funcs);
return `"${key}":""${wtf}"` // bug: duplicate quotes
}
} else {
func = function () {
var [key, func] = nextFunc(funcs);
try {
var res = func();
if (res && res.charCodeAt) {
res = res.replace(/"/g, encodeStr('\\"')); // bug: encoded twice
return `"${key}":"${res}"`;
} else return `"${key}": ${res.toString()}`
} catch (n) {
return `"${key}": 20170511`
}
}
}
var length = funcs.length;
var str = `{${Array.from({length}).map(func).join()}}`;
console.log(str);
if (!isNodejs) {
var charCodes = genRes(encodeStr(str), [94, 126, 97, 99, 69, 49, 36, 43, 69, 117, 51, 95, 97, 76, 118, 48, 106, 103, 69, 87, 90, 37, 117, 55, 62, 77, 103, 38, 69, 53, 70, 80, 81, 48, 80, 111, 51, 73, 68, 125, 117, 51, 93, 87, 100, 45, 42, 105, 73, 40, 95, 52, 126, 80, 56, 71]);
var data = btoa(fromCharCodes(charCodes));
OUT.data = data
} else OUT.data = btoa(fromCharCodes(str))
//console.log(OUT);
}
var funcs = [['p1', function () {
return window.encodeURIComponent(window.JSON.stringify(IN))
}], ['u1', function () {
return "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
}], ['l1', function () {
return "en-US"
}], ['s1', function () {
return 1080
}], ['s2', function () {
return 1920
}], ['w1', function () {
return "NULL"
}], ['w2', function () {
return "NULL"
}], ['a1', function () {
return 1920
}], ['a2', function () {
return 1040
}], ['s3', function () {
return true
}], ['l2', function () {
return true
}], ['i1', function () {
return true
}], ['a3', function () {
return false
}], ['p2', function () {
return "Win32"
}], ['d1', function () {
return "NULL"
}], ['c1', function () {
return true
}], ['a4', function () {
return false
}], ['p3', function () {
return false
}], ['n1', function () {
return 20170511
}], ['w3', function () {
return false
}], ['e1', function () {
return 20170511
}], ['n2', function () {
return 20170511
}], ['n3', function () {
return 20170511
}], ['r1', function () {
return "function random() { [native code] }"
}], ['t1', function () {
return "function toString() { [native code] }"
}], ['w4', function () {
return "stop,open,alert,confirm,prompt,print,requestAnimationFrame,cancelAnimationFrame,requestIdleCallback,cancelIdleCallback,captureEvents,releaseEvents,getComputedStyle,matchMedia,moveTo,moveBy,resizeTo,resizeBy,getSelection,find,getMatchedCSSRules"
}], ['t2', function () {
return Math.floor(Date.now() / 1000)
}], ['m1', function () {
return 'basilisk_aLv0jg'
}]];
startRun();
return OUT.data
}
return bsk_solver('"""
bsk_js_full=bsk_js_1+tbs_str+"')"
BSK_=driver.execute_script(bsk_js_full)
driver.quit()
return BSK_
def read_source(self):
fname_pic="pic_tail.txt"
with open(fname_pic) as f:
self.img_list = f.read().splitlines()
f.close()
fname_wenzi="wisdom_tail.txt"
with codecs.open(fname_wenzi, "r", "utf-8") as f1:
self.widsom_list = f1.read().splitlines()
f1.close()
def return_random_tail(self):
tail_append=""
# widsom in tail
while True:
list_widsom_index=randint(0,len(self.widsom_list)-1)
data = "".join(self.widsom_list[list_widsom_index].split())
if len(data)>80:
break
tail_append=tail_append+"[br]"+data+"[br]"
#image in tail
img_index=randint(0,len(self.img_list)-1)
tail_append=tail_append+"[br]"+self.Get_size_of_url_img(self.img_list[img_index])+"[br]"
# emoji in tail
list_emoji=random.sample(xrange(1,70), 0)
emoji_head="[emotion pic_type=1 width=30 height=30]https://gsp0.baidu.com/5aAHeD3nKhI2p27j8IqW0jdnxx1xbK/tb/editor/images/client/image_emoticon"
emoji_tail=".png[/emotion] "
for each_id in list_emoji:
tail_append=tail_append+emoji_head+str(each_id)+emoji_tail
return self.tail+tail_append
def return_fix_tail(self):
tail_append="" #"[img pic_type=0 width=560 height=322]https://imgsa.baidu.com/forum/pic/item/ec4fa635e5dde711e2edc463adefce1b9d166111.jpg[/img]"
return self.tail
def return_tail(self):
return self.return_fix_tail()
def change_tieba(self):
htmlparser = HTMLParser()
self.tiebaName_utf = raw_input('Type tieba Name\n').decode(sys.stdin.encoding or locale.getpreferredencoding(True))
self.tiebaName_url=urllib.quote(self.tiebaName_utf.encode('utf-8')) # encode utf-8 as url
self.shouye(1)
def shouye(self,page):
print "************Shouye Layer "+ str(page) +"************"
website = unicode('http://tieba.baidu.com/f?kw='+ self.tiebaName_url +'&ie=utf-8&pn=')
link = website + unicode(str((page-1)*50))
print "url="+link+"\n"
buffer = StringIO()
self.c.setopt(pycurl.URL, link)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue().decode('utf-8', 'ignore')
self.fid = re.search(r"forum = {\s+'id': (\d+),", body).group(1)
self.tbs = re.search(r"tbs.*\"(.*)\"", body).group(1)
self.name_url = re.search(r"'name_url':\s+\"(.*)\"", body).group(1)
doc = lxml.html.fromstring(body)
posts_datafields = doc.xpath("//li[contains(@class, ' j_thread_list')]/@data-field")
links = doc.xpath("//a[@class='j_th_tit ']/@href")
titles = doc.xpath("//a[@class='j_th_tit ']/@title")
Header_list=[]
self.header_max_width=12
self.title_max_width=70
i=0
for each_title in titles:
each_floors_data_field_json = json.loads(posts_datafields[i])
poster=each_floors_data_field_json['author_name']
poster_str=urllib.unquote(poster)
reply_num=each_floors_data_field_json['reply_num']
Header="index "+colored(str(i),'magenta')
Tail=colored("--"+poster_str+ " ("+ str(reply_num)+") ",'cyan')
each_title=": "+each_title
Header_list.append([Header,each_title,Tail])
Header_fmt= u'{0:<%s}' % (self.header_max_width - self.wide_chars(Header))
Title_fmt= u'{0:<%s}' % (self.title_max_width - self.wide_chars(each_title))
try:
print (Header_fmt.format(Header)+Title_fmt.format(each_title) + Tail).encode("gb18030")
except:
print (Header_fmt.format(Header)+"Title can't be displayed").encode("gb18030")
print ""
i=i+1
self.tiezi_link=links
self.shouye_titles=Header_list
print "\n---------------------"
def representInt(self,s):
try:
int(s)
return True
except ValueError:
return False
def go_into_each_post(self,index):
self.pid_floor_map={}
self.author_floor_map={}
self.content_floor_map={}
self.current_thread_img_list={}
if self.representInt(index):
each_post_link='https://tieba.baidu.com'+self.tiezi_link[index]+"?pn=1"
else:
each_post_link=index+'?pn=1' # for specific url
self.current_view_tiezi_link=each_post_link
buffer = StringIO()
self.c.setopt(pycurl.URL, each_post_link)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue().decode('utf-8', 'ignore')
#----------------- get fid tbs tid for future reply purpose
doc = lxml.html.fromstring(body)
script = doc.xpath('//script[contains(., "PageData")]/text()')[0]
data = re.search(r"{(.*)UTF", script).group(1)
self.fid=re.search(r"fid:'(\d+)'", body).group(1)
data="{"+data+"\"}"
tbs_json = json.loads(data)
self.tbs=tbs_json["tbs"] # get tbs
self.tid=re.search(r"p/(\d+)", self.current_view_tiezi_link).group(1)
self.tiebaName_utf=re.search(r"forumName':\s+'(.*)',", body).group(1)
#------------------
title=doc.xpath("//*[self::h1 or self::h2 or self::h3][contains(@class, 'core_title_txt')]")
try:
print ("\n\n"+self.tiebaName_utf +" >> "+title[0].text_content()).encode("gb18030")
except:
print "************Tiezi : title can't be displayed************"
#print each_post_link
#--- get how many pages in total
pager = doc.xpath("//li[@class='l_pager pager_theme_5 pb_list_pager']/a/@href")
if pager:
last_page=pager[-1]
last_page_number = re.search(r"pn=(\d+)", last_page).group(1)
page_list=range(1,int(last_page_number)+1)
for each_page_num in page_list:
#print each_page_num
self.view_each_post_by_pages(index,each_page_num)
else:
self.view_each_post_by_pages(index,1)
print "************no more replies************"
def view_each_post_by_pages(self,index,page_number):
if self.representInt(index):
each_post_link='https://tieba.baidu.com'+self.tiezi_link[index]+"?pn="+str(page_number)
else:
each_post_link=index+"?pn="+str(page_number) # for specific url
buffer = StringIO()
self.c.setopt(pycurl.URL, each_post_link)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue().decode('utf-8', 'ignore')
doc = lxml.html.fromstring(body)
print "\n_____________ Page "+str(page_number)+": "+ colored(each_post_link,'blue') + " _____________\n"
p_postlist_All=doc.xpath("//div[@id='j_p_postlist']/div[contains(@class, 'l_post')]")
p_postlist_datafield=doc.xpath("//div[@id='j_p_postlist']/div[contains(@class, 'l_post')]/@data-field")
i=0
max_width = 20 # Align chinese character:fix width : change this number to fit your screen
for each_data_field in p_postlist_datafield:
each_floors_data_field_json = json.loads(each_data_field)
poster=each_floors_data_field_json['author']['user_name']
post_no=each_floors_data_field_json['content']['post_no']
post_id=each_floors_data_field_json['content']['post_id']
post_comment_num=each_floors_data_field_json['content']['comment_num']
if 'level_id' not in each_floors_data_field_json['author']:
# This is for tieba old DOM structure
if not p_postlist_All[i].xpath(".//div[@class= 'd_badge_lv']"):
i=i+1
continue # this floor is AD
else:
poster_level=p_postlist_All[i].xpath(".//div[@class= 'd_badge_lv']")[0].text_content()
post_content=p_postlist_All[i].xpath(".//div[contains(@class, 'd_post_content j_d_post_content')]")[0].text_content().lstrip()
if len(p_postlist_All[i].xpath(".//span[@class= 'tail-info']"))>2:
post_open_type=p_postlist_All[i].xpath(".//span[@class= 'tail-info']")[0].text_content().lstrip()
else:
post_open_type=""
post_date=p_postlist_All[i].xpath(".//span[@class= 'tail-info']")[-1].text_content()
else:
# This is for tieba new DOM structure
poster_level=each_floors_data_field_json['author']['level_id'] # only in Chouxiang tieba has this option
post_date=each_floors_data_field_json['content']['date']
post_open_type=each_floors_data_field_json['content']['open_type']
post_content=p_postlist_All[i].xpath(".//div[contains(@class, 'd_post_content ')]")[0].text_content().lstrip() #@class= 'd_post_content j_d_post_content clearfix'
self.pid_floor_map[str(post_no)]=str(post_id)
self.author_floor_map[str(post_no)]=poster
if self.zklz==True:
if poster != self.author_floor_map['1']:
i=i+1
continue
poster_str=colored(urllib.unquote(poster),'green') # transform from urlencode string to utf-8 string
if poster == self.author_floor_map['1']:
poster_str=poster_str+ " <"+str(poster_level)+u"> "+ colored(u"楼主" ,'red') # add poster level
else:
poster_str=poster_str+ " <"+str(poster_level)+">" # add poster level
post_if_img=p_postlist_All[i].xpath(".//div[contains(@class, 'd_post_content')]/img[@class='BDE_Image']/@src")
post_if_video=p_postlist_All[i].xpath(".//embed/@data-video")
if post_if_video:
post_content=post_content+ colored("\n<video url: " + post_if_video[0] +" >",'yellow')
if post_if_img:
#print "img detected!"
img_list=[]
for each_img_src in post_if_img:
post_content=post_content+ colored("\n<img url: " + each_img_src +" >",'yellow')
img_list.append(each_img_src)
#pprint(img_list)
self.current_thread_img_list[str(post_no)]=img_list
poster_fmt= u'{0:<%s}' % (max_width - self.wide_chars(poster_str))
content=""
try:
if post_comment_num>0:
content=colored(str(post_no),'cyan')+ "L : "+ poster_fmt.format(poster_str) +" : "+ post_content + u" 回复("+colored(str(post_comment_num),'green')+")"
else:
content=colored(str(post_no),'cyan')+ "L : "+ poster_fmt.format(poster_str) +" : "+ post_content
except:
content=colored(str(post_no),'cyan')+ "L : "+ poster_fmt.format(poster_str)+ " can't be displayed "
print (content).encode("gb18030")
self.content_floor_map[str(post_no)]=content
if post_open_type=="apple":
print "iphone" + " " + post_date
elif post_open_type=="":
print "PC" + " " + post_date
elif post_open_type=="android":
print post_open_type + " " + post_date
else:
print post_open_type + " " + post_date
i=i+1
print "-----\n"
def Load_cookie(self):
with open('cookie.txt') as data_file:
data = json.load(data_file)
chunks = []
for cookie_each_element in data:
name, value,domain = cookie_each_element['name'], cookie_each_element['value'],cookie_each_element['domain']
name = quote_plus(name)
value = quote_plus(value)
chunks.append('%s=%s;domain=%s;' % (name, value,domain))
self.c.setopt(pycurl.PROXY, 'http://192.168.87.15:8080')
self.c.setopt(pycurl.PROXYUSERPWD, 'LL66269:')
self.c.setopt(pycurl.PROXYAUTH, pycurl.HTTPAUTH_NTLM)
self.c.setopt(self.c.FOLLOWLOCATION, 1)
self.c.setopt(pycurl.VERBOSE, 0)
self.c.setopt(pycurl.FAILONERROR, True)
self.c.setopt(pycurl.COOKIE, ''.join(chunks))
#------------------- Need to use each post page's own cookie to login
url_tbs = 'http://tieba.baidu.com/dc/common/tbs'
buffer = StringIO()
self.c.setopt(pycurl.URL, url_tbs)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body=buffer.getvalue()
dic_json = json.loads(body)
print "islogin="+ str(dic_json['is_login']) # check if logged in
def Reply_to_floor(self,floor_num):
content = raw_input('you replied:\n').decode(sys.stdin.encoding or locale.getpreferredencoding(True))
pid=self.pid_floor_map[str(floor_num)]
kw=self.tiebaName_utf
data_form = {
'ie': 'utf-8',
'kw': kw.encode("utf-8"),
'fid': self.fid,
'tid': self.tid,
'floor_num':floor_num,
'quote_id':pid,
'rich_text':'1',
'lp_type':'0',
'lp_sub_type':'0',
'tag':'11',
'content': content.encode("utf-8"),
'tbs': self.tbs,
'basilisk':'1',
'new_vcode':1,
'repostid':pid,
'anonymous':'0',
'_BSK' : self.solve_bsk(self.tbs)
}
#pprint (data_form)
buffer = StringIO()
data_post = urllib.urlencode(data_form)
url = 'https://tieba.baidu.com/f/commit/post/add'
self.c.setopt(pycurl.URL, url)
self.c.setopt(pycurl.POST, 1)
self.c.setopt(pycurl.POSTFIELDS, data_post)
self.c.setopt(self.c.WRITEFUNCTION, buffer.write)
self.c.setopt(pycurl.VERBOSE, 0)
self.c.perform()
response = buffer.getvalue() #here we got the response data
response_json = json.loads(response)
is_succeed=response_json["no"] # error code: 0 means success
if is_succeed==0:
print "comment successfully!"
else:
pprint (response_json)
print "comment failed!"
def lzl_more(self,floor_num):
print ("\n\n"+self.content_floor_map[str(floor_num)]+"\n").encode("gb18030")
pid=self.pid_floor_map[str(floor_num)]
lzl_more_url="https://tieba.baidu.com/p/comment?tid="+self.tid+"&pid="+ pid +"&pn=1"
buffer = StringIO()
self.c.setopt(pycurl.URL, lzl_more_url)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue().decode('utf-8', 'ignore')
doc = lxml.html.fromstring(body)
lzl_single_post = doc.xpath('//li[contains(@class, "lzl_single_post")]')
max_width=20
for each_lzl in lzl_single_post:
lzl_user=each_lzl.xpath('.//a[contains(@class, "at j_user_card ")]')[0].text
lzl_content_main=each_lzl.xpath('.//span[contains(@class, "lzl_content_main")]')[0].text_content()
poster_fmt= u'{0:<%s}' % (max_width - self.wide_chars(lzl_user))
print (poster_fmt.format(lzl_user)+" : "+lzl_content_main).encode("gb18030")
lzl_page = doc.xpath('//li[contains(@class, "lzl_li_pager")]/@data-field')[0]
page_json = json.loads(lzl_page)
total_page=page_json["total_page"] # get lzl total page
count=1
#print "\n<total_page:"+str(total_page)+">"
if total_page>1:
while (count<total_page):
count=count+1
lzl_more_url="https://tieba.baidu.com/p/comment?tid="+self.tid+"&pid="+ pid +"&pn=" +str(count)
#print lzl_more_url
buffer = StringIO()
self.c.setopt(pycurl.URL, lzl_more_url)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue().decode('utf-8', 'ignore')
doc = lxml.html.fromstring(body)
lzl_single_post = doc.xpath('//li[contains(@class, "lzl_single_post")]')
for each_lzl in lzl_single_post:
lzl_user=each_lzl.xpath('.//a[contains(@class, "at j_user_card ")]')[0].text
lzl_content_main=each_lzl.xpath('.//span[contains(@class, "lzl_content_main")]')[0].text_content()
poster_fmt= u'{0:<%s}' % (max_width - self.wide_chars(lzl_user))
print (poster_fmt.format(lzl_user)+" : "+lzl_content_main).encode("gb18030")
print "------------------------------"
def view_image(self):
print "launch picture viewer..."
viewer_app = QtGui.QApplication(sys.argv)
ex = Pic_Viewer(self.current_thread_img_list)
sys.exit(viewer_app.exec_())
def Make_New_Post(self):
title = raw_input('your title:\n').decode(sys.stdin.encoding or locale.getpreferredencoding(True))
content = raw_input('your content:\n').decode(sys.stdin.encoding or locale.getpreferredencoding(True))
url_img = raw_input('any img url to insert?\n').decode(sys.stdin.encoding or locale.getpreferredencoding(True))
if url_img !="":
url_img_upload=self.Get_size_of_url_img(url_img)
content=url_img_upload+"[br]"+content+self.return_tail()
else:
content=content+self.return_tail()
#----
shouye_link = unicode('http://tieba.baidu.com/f?kw='+ self.tiebaName_url +'&ie=utf-8&pn=1')
buffer = StringIO()
self.c.setopt(pycurl.URL, shouye_link)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body=buffer.getvalue().decode('utf-8', 'ignore')
tree = lxml.html.fromstring(body)
fid=re.search(r"forum = {\s+'id'\:\s+(\d+)", body).group(1)
tbs=re.search(r"tbs\': \"(\w+)\"", body).group(1) # get tbs
kw=self.tiebaName_utf
mouse_pwd_t=str(int(time.time()))
mouse_pwd=mouse_pwd_t+'0'
mouse_pwd=self.mouse_pwd_fix+mouse_pwd
# using signature
signature=[{'id':15309379,'name':'西财'},{'id':43817160,'name':'早乙女1'},{'id':43817169,'name':'早乙女2'},{'id':24324097,'name':'ubw'},{'id':43817177,'name':'早乙女3'}]
id_=randint(0,len(signature)-1)
sign_id=signature[id_]['id']
#
data_form = {
'ie': 'utf-8',
'kw': kw.encode("utf-8"),
'fid': fid,
'tid': '0',
'content': content.encode("utf-8"),
'title':title.encode("utf-8"),
'rich_text': '1',
'tbs': tbs,
'floor_num':'0',
'sign_id':sign_id,
'mouse_pwd':mouse_pwd,
'mouse_pwd_t':mouse_pwd_t,
'__type__': 'thread',
'mouse_pwd_isclick':'0',
'_BSK' : self.solve_bsk(tbs)
}
buffer = StringIO()
data_post = urllib.urlencode(data_form)
url = 'http://tieba.baidu.com/f/commit/thread/add'
self.c.setopt(pycurl.URL, url)
self.c.setopt(pycurl.POST, 1)
self.c.setopt(pycurl.POSTFIELDS, data_post)
self.c.setopt(self.c.WRITEFUNCTION, buffer.write)
self.c.perform()
response = buffer.getvalue() #here we got the response data
response_json = json.loads(response)
is_succeed=response_json["no"] # error code: 0 means success
if is_succeed==0:
print "post successfully!"
else:
print "post failed!"
def Reply_this_post(self):
url_img = raw_input('any img url to insert?\n').decode(sys.stdin.encoding or locale.getpreferredencoding(True))
content = raw_input('you replied:\n').decode(sys.stdin.encoding or locale.getpreferredencoding(True))
if url_img !="":
url_img_upload=self.Get_size_of_url_img(url_img)
content=content+"[br]"+url_img_upload+self.return_tail()
#print content
else:
content=content+self.return_tail()
kw=self.tiebaName_utf
mouse_pwd_t=str(int(time.time()))
mouse_pwd=mouse_pwd_t+'0'
mouse_pwd=self.mouse_pwd_fix+mouse_pwd
# using signature
signature=[{'id':15309379,'name':'西财'},{'id':43817160,'name':'早乙女1'},{'id':43817169,'name':'早乙女2'},{'id':24324097,'name':'ubw'},{'id':43817177,'name':'早乙女3'}]
id_=randint(0,len(signature)-1)
sign_id=signature[id_]['id']
data_form = {
'ie': 'utf-8',
'kw': kw.encode("utf-8"),
'fid': self.fid,
'tid': self.tid,
'content': content.encode("utf-8"),
'is_login': '1',
'rich_text': '1',
'tbs': self.tbs,
'sign_id':sign_id,
'mouse_pwd':mouse_pwd,
'mouse_pwd_t':mouse_pwd_t,
'__type__': 'reply',
'mouse_pwd_isclick':'0',
'_BSK' : self.solve_bsk(self.tbs)
}
#pprint (data_form)
buffer = StringIO()
data_post = urllib.urlencode(data_form)
url = 'https://tieba.baidu.com/f/commit/post/add'
self.c.setopt(pycurl.URL, url)
self.c.setopt(pycurl.POST, 1)
self.c.setopt(pycurl.POSTFIELDS, data_post)
self.c.setopt(self.c.WRITEFUNCTION, buffer.write)
self.c.setopt(pycurl.VERBOSE, 0)
self.c.perform()
response = buffer.getvalue() #here we got the response data
response_json = json.loads(response)
is_succeed=response_json["no"] # error code: 0 means success
if is_succeed==0:
print "comment successfully!"
else:
pprint (response_json)
print "comment failed!"
def Get_size_of_url_img(self,url_img):
fp = open("img_upload", "wb")
img_c = pycurl.Curl()
USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'
img_c.setopt(pycurl.PROXY, 'http://192.168.87.15:8080')
img_c.setopt(pycurl.PROXYUSERPWD, 'LL66269:')
img_c.setopt(pycurl.PROXYAUTH, pycurl.HTTPAUTH_NTLM)
img_c.setopt(img_c.FOLLOWLOCATION, 1)
img_c.setopt(pycurl.VERBOSE, 0)
img_c.setopt(pycurl.FAILONERROR, True)
img_c.setopt(pycurl.USERAGENT, USER_AGENT)
img_c.setopt(pycurl.URL, url_img)
img_c.setopt(pycurl.WRITEDATA, fp)
img_c.perform()
img_c.close()
fp.close()
im=Image.open("img_upload")
width, height = im.size
#if width>550:
# print "width is suggested to be less than 550"
#print pos.stat("img_upload").st_size
url_upload="[img pic_type=1 width="+ str(width) +" height="+str(height)+"]"+url_img+"[/img]"
return(url_upload)
def my_reply(self,page):
print u"我回复的:\n"
my_reply_link="http://tieba.baidu.com/i/i/my_reply?&pn="+str(page)
buffer = StringIO()
self.c.setopt(pycurl.URL, my_reply_link)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue().decode('utf-8', 'ignore')
doc = lxml.html.fromstring(body)
reply_block = doc.xpath("//div[contains(@class, 'block t_forward clearfix ')]")
for block in reply_block:
if block.xpath(".//a[contains(@class, 'for_reply_context')]"):
block_my_reply=block.xpath(".//a[contains(@class, 'for_reply_context')]")[0].text_content()
else:
block_my_reply=u"emoji or pure picture"
block_common_source_main=block.xpath(".//div[contains(@class, 'common_source_main')]")[0]
tiezi_url="http://tieba.baidu.com"+block_common_source_main.xpath("./a[1]/@href")[0]
tiezi_title=block_common_source_main.xpath("./a[1]")[0].text
tiezi_text=block_common_source_main.text_content()
reply_num=re.search(r"(\(\d*\))", tiezi_text).group(1)
block_tieba_name=block_common_source_main.xpath("./a[3]")[0].text
print ("'"+block_my_reply+"'").encode("gb18030")
print ("from: "+tiezi_title + " "+reply_num + " -- "+block_tieba_name).encode("gb18030")
print "url: "+tiezi_url
print """
"""
def my_forum(self):
#http://tieba.baidu.com/mo/q---995BABCDC4E864DFC079CE055F7D0C57%3AFG%3D1--0-1-0--2/m?tn=bdFBW&tab=favorite
print u"我关注的贴吧:\n"
my_forum_link="http://tieba.baidu.com/mo/q---995BABCDC4E864DFC079CE055F7D0C57%3AFG%3D1--1-3-0--2--wapp_1499966495430_639/m?tn=bdFBW&tab=favorite"
buffer = StringIO()
self.c.setopt(pycurl.URL, my_forum_link)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue()#.decode('utf-8', 'ignore')
doc = lxml.html.fromstring(body)
pagination = doc.xpath("//tr")
#print pagination
max_width = 20
for each_page in pagination:
each_forum=each_page.xpath("./td[1]")[0].text_content()
each_forum_level=each_page.xpath("./td[2]")[0].text_content()
poster_fmt= u'{0:<%s}' % (max_width - self.wide_chars(each_forum))
print (poster_fmt.format(each_forum) +" <"+ each_forum_level +">").encode("gb18030")
def my_tie(self):
print u"我的贴子:\n"
my_forum_link="http://tieba.baidu.com/i/i/my_tie"
buffer = StringIO()
self.c.setopt(pycurl.URL, my_forum_link)
self.c.setopt(self.c.WRITEDATA,buffer)
self.c.perform()
body=buffer.getvalue().decode('utf-8', 'ignore')
doc=lxml.html.fromstring(body)
tiezi_list=doc.xpath("//div[@class='simple_block_container']/ul/li")
#http://tieba.baidu.com/p/5228903492?pid=109502281993
for each_tiezi in tiezi_list:
tiezi_text=each_tiezi.text_content()
tiezi_link="http://tieba.baidu.com"+each_tiezi.xpath(".//a[@class='thread_title']/@href")[0]
print (tiezi_text).encode("gb18030")
print tiezi_link
print "---\n\n"
def onekeySignin(self):
#'tbs': '2b506030c2989d171500408206'
#my_forum_link="https://tieba.baidu.com/index.html"
#file_out = codecs.open("mao_out.txt", "w", "utf-8")
#buffer = StringIO()
#self.c.setopt(pycurl.URL, my_forum_link)
#self.c.setopt(self.c.WRITEDATA,buffer)
#self.c.perform()
#body=buffer.getvalue().decode('utf-8', 'ignore')
#file_out.write(body)
#tbs=re.search(r"PageData\.tbs.*\"(.*)\"", body).group(1)
data_form = {
'ie': 'utf-8',
'kw':self.tiebaName_utf.encode("utf-8"),
'tbs': self.tbs,
}
buffer = StringIO()
data_post = urllib.urlencode(data_form)
url = 'https://tieba.baidu.com/sign/add'
self.c.setopt(pycurl.URL, url)
self.c.setopt(pycurl.POST, 1)
self.c.setopt(pycurl.POSTFIELDS, data_post)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
response = buffer.getvalue() #here we got the response data
response_json = json.loads(response)
is_succeed=response_json["error"]
if is_succeed=="":
print u"签到成功!"
else:
print (is_succeed).encode("gb18030")
def like(self):
data_form = {
'fid': self.fid ,
'ie': 'gbk',
'fname':self.tiebaName_utf.encode("utf-8"),
'uid' :self.name_url,
'tbs': self.tbs,
}
buffer = StringIO()
data_post = urllib.urlencode(data_form)
url = 'http://tieba.baidu.com/f/like/commit/add'
self.c.setopt(pycurl.URL, url)
self.c.setopt(pycurl.POST, 1)
self.c.setopt(pycurl.POSTFIELDS, data_post)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
response = buffer.getvalue() #here we got the response data
response_json = json.loads(response)
is_succeed=response_json["error"]
level_name=response_json["level_name"]
#pprint (response_json)
if is_succeed=="":
print u"已关注"
print (u"本吧头衔: "+level_name).encode("gb18030")
else:
print u"关注失败"
def dislike(self):
data_form = {
'fid': self.fid ,
'ie': 'gbk',
'fname':self.tiebaName_utf.encode("utf-8"),
'uid' :self.name_url,
'tbs': self.tbs,
}
buffer = StringIO()
data_post = urllib.urlencode(data_form)
url = 'http://tieba.baidu.com/f/like/commit/delete'
self.c.setopt(pycurl.URL, url)
self.c.setopt(pycurl.POST, 1)
self.c.setopt(pycurl.POSTFIELDS, data_post)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
response = buffer.getvalue() #here we got the response data
response_json = json.loads(response)
#pprint (response_json)
is_succeed=response_json["data"]["ret"]["is_done"]
if is_succeed==True:
print u"已取消关注"
else:
print u"取消关注失败,或者未关注该吧"
def replyme(self):
print u"回复我的:\n"
replyme_link="http://tieba.baidu.com/i/i/replyme"
buffer = StringIO()
self.c.setopt(pycurl.URL, replyme_link)
self.c.setopt(self.c.WRITEDATA, buffer)
self.c.perform()
body = buffer.getvalue().decode('utf-8', 'ignore')
doc = lxml.html.fromstring(body)
reply_list = doc.xpath("//div[@id='feed']/ul/li")
max_width = 20
for each_reply in reply_list:
replyme_user=each_reply.xpath(".//div[@class='replyme_user']")[0].text_content()
replyme_content=each_reply.xpath(".//div[@class='replyme_content']")[0].text_content()
replyme_url=each_reply.xpath(".//div[@class='replyme_content']/a/@href")[0]
replyme_url="http://tieba.baidu.com"+replyme_url
feed_from=each_reply.xpath(".//div[@class='feed_from']")[0].text_content()
poster_fmt= u'{0:<%s}' % (max_width - self.wide_chars(replyme_user))
print (poster_fmt.format(replyme_user) +" replied: "+ replyme_content +" \n\n -- " + feed_from.lstrip().rstrip() +"" ).encode("gb18030")
print "url: "+replyme_url
print """
"""
def Get_Back_To_shouye(self):
print "************Shouye Layer************"
i=0
for Header,each_title,Tail in self.shouye_titles:
Header_fmt= u'{0:<%s}' % (self.header_max_width - self.wide_chars(Header))
title_fmt= u'{0:<%s}' % (self.title_max_width - self.wide_chars(each_title))
try:
print (Header_fmt.format(Header)+title_fmt.format(each_title) + Tail).encode("gb18030")
except:
print (Header_fmt.format(Header)+"Title can't be displayed").encode("gb18030")
i=i+1
print ""
print "\n---------------------"
def wide_chars(self,s):
#return the extra width for wide characters
if isinstance(s, str):
s = s.decode('utf-8')
return sum(unicodedata.east_asian_width(x) in ('F', 'W') for x in s)
def encode_utf_html(self,input_unicode):
res = []
for b in input_unicode:
o = ord(b)
if o > 255:
res.append('&#{};'.format(o))
else:
res.append(b)
res_string = ''.join(res)
return(res_string)
def Refresh_tiezi(self):
if self.last_viewed_tiezi_index>0:
self.go_into_each_post(self.last_viewed_tiezi_index)
else:
print "you haven't viewed any tiezi yet"
def Refresh_shouye(self):
self.shouye(1)
def exit(self):
self.c.close()
# main function
app=Browser_tieba()
while True:
print """
"""
nb = raw_input('Give me your command (or type help to see your options): \n')
try:
if nb.startswith( 's ' )==True:
sp=re.search(r"s\s+(\d+)", nb).group(1)
sp=int(sp)
if sp>=1:
app.shouye(sp)
elif nb.startswith( 'help' )==True:
help="----- Help for different command -----\n"
a="a -Begin to surf around tieba with its name (First step !)\n"
s="s -go to specific pages; How to use: (s 10)\n"
t="t -go to specific tiezi; How to use: (t 12) or (t https://tieba.baidu.com/p/4803063434)\n"
pic="pic - launch image viewer to browse all the picutures in current thread\n"
p="p - make a new post\n"
r="r -reply to either OP or to a specific floor; How to use: (r) or (r 12)\n"
lzl="lzl -view lzl content for a specific floor; How to use: (lzl 12)\n"
zklz="zklz - View comment made by OP only;\n"
f="f -refresh posts in shouye;\n"
ft="ft -refresh comments for the current post;\n"
b="b -go back to the list of all the posts in shouye; \n"
mf="mf -view all your favorite tieba ; \n"
mr="mr -view your most recent comments ; \n"
rm="rm -view who replied to you ; \n"
mt="mt -view all thread posted by you \n"
signin="si - sign in (si) \n"
like = "like - like this forum (like); dislike this forum (dislike)\n"
e="e -exit the browser;\n"
c="c -clear the screen;\n"
end="--------------------------------------"
print help+a+s+t+pic+p+r+zklz+lzl+f+ft+b+mf+mr+rm+signin+like+e+c+end
elif nb.startswith( 't ' )==True:
index=re.search(r"t\s+(.*)", nb).group(1)
try:
index=int(index)
if index>=50 or index <=0:
print "put correct index: 1-49"
continue
else:
app.go_into_each_post(index)
app.last_viewed_tiezi_index=index
except:
if 'fid' in index:
index = raw_input('exclude fid part in url and type again: \n')
app.go_into_each_post(index)
app.last_viewed_tiezi_index=index
elif nb.startswith( 'r ' )==True:
floor_num=re.search(r"r\s+(\d+)", nb).group(1)
app.Reply_to_floor(floor_num)
elif nb.startswith( 'lzl ' )==True:
floor_num=re.search(r"(\d+)", nb).group(1)
app.lzl_more(floor_num)
elif nb =="r":
app.Reply_this_post()
elif nb =="rm":
app.replyme()
elif nb =="mt":
app.my_tie()
elif nb =="p":
app.Make_New_Post()
elif nb =="mr":
app.my_reply(1)
elif nb =="mf":
app.my_forum()
elif nb =="si":
app.onekeySignin()
elif nb =="like":
app.like()
elif nb =="dislike":
app.dislike()
elif nb.startswith( 'mr ' )==True:
page_num=re.search(r"(\d+)", nb).group(1)
app.my_reply(page_num)
elif nb=="pic":
app.view_image()
elif nb == "f":
print "refreshing shouye"
app.Refresh_shouye()
elif nb =="b":
app.Get_Back_To_shouye()
elif nb =="e":
break
elif nb =="c":
os.system('cls') # on windows
elif nb =="a":
app.change_tieba()
elif nb.startswith( 'zklz' )==True:
app.zklz=True
print u"只看楼主"
app.Refresh_tiezi()
app.zklz=False
elif nb.startswith( 'url:' )==True:
app.Refresh_tiezi()
elif nb =="ft": # refresh this post only
app.Refresh_tiezi()
else:
print "Please type the correct command"
except:
print ""
print """
_ _
| | | |
| |__ _ _ _____ | |__ _ _ _____
| _ \| | | | ___ | | _ \| | | | ___ |
| |_) ) |_| | ____| | |_) ) |_| | ____|
|____/ \__ |_____) |____/ \__ |_____)
(____/ (____/
"""
app.exit()