Part 1: Data Structures and Algorithms
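    '''
    Immutable: numbers, strings, tuples
    Mutable: lists, dicts
    Atomic: numbers, strings
    Containers: lists, tuples, dicts
    Direct access: numbers
    Sequential access: strings, lists, tuples
    Mapping access: dicts
    '''
    # Unpacking is one-to-one: one target per element
    a,b,c,d,e='hello'
    print(e,d)
    # Too few targets raises ValueError
    # a,b,c='hello'
    # Using * to collect the elements in between
    a,*_,e='hello'
    print(a,e)

    # Unpacking the elements of a list is one-to-one as well
    data=['mac',10000,[2016,10,12]]
    name,price,date=data
    print(name,price,date)
    # Unpack the nested elements too
    name,price,[year,month,day]=data
    print(name,price,year,month,day)
    # Use _ for elements you do not need
    data=['mac',10000,[2016,10,12]]
    _,price,_=data
    print(price)
    # Take from the head, take from the tail
    record=['lhf','male',18,'12345@qq.com','1861131211']
    *_,phone=record
    name,*_=record
    print(phone)
    print(name)

    # Unpack an iterable into several variables:
    # compare the eighth month's sales with the average of the first seven
    sales_record=[11,12,3,7,9,6,3,5]
    *head,tail=sales_record
    print(head,tail)
    print(sum(head)/len(head))
    if tail > sum(head)/len(head):
        print('Month 8 sales are above the average of the first seven months')
    elif tail == sum(head)/len(head):
        print('Month 8 sales equal the average of the first seven months')
    else:
        print('Month 8 sales are below the average of the first seven months')

    # Unpacking applied to function dispatch
    records=[
        ('say_hi','hello'),
        ('calculate',10,20,30,40),
        ('dic_handle','name','lhf')
    ]
    def say_hi(msg):
        print(msg)
    def calculate(l):
        res=0
        for i in l:
            res+=i
        print(res)
    def dic_handle(l):
        key,val=l
        dic={key:val}
        print(dic)
    for func,*args in records:
        if func == 'say_hi':
            say_hi(*args)  # unpack so msg is the string, not a one-element list
        elif func == 'calculate':
            calculate(args)
        elif func == 'dic_handle':
            dic_handle(args)

    # A user record on a Linux system
    record='root:x:0:0:super user:/root:/bin/bash'
    *head,home_dir,_=record.split(':')
    print('home directory',home_dir)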
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    from collections import deque

    # # Implement a queue; without maxlen it can grow without bound
    # d=deque(maxlen=3)
    # d.append(1)
    # d.append(2)
    # d.append(3)
    # print(d)
    # d.append(4)
    # print(d)
    # d.appendleft(5)
    # print(d)
    # print(d.pop())
    # print(d.popleft())

    def search(file,pattern,max_len=5):
        # keep at most max_len lines seen before the current one
        pre_lines=deque(maxlen=max_len)
        for line in file:
            if pattern in line:
                yield pre_lines,line
            pre_lines.append(line)

    if __name__ == '__main__':
        with open('测试文件') as file:
            for pre_l,line in search(file,'Exchange'):
                print('-'*60)
                for i in pre_l:
                    print(i,end='')  # each line already ends with a newline
                print('matched line----->',line)
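The bounded-queue behaviour that `search` depends on can be checked on its own. This is only a minimal sketch: `last_n` is a hypothetical helper name that does not appear in the original code.

```python
from collections import deque

def last_n(iterable, n):
    """Return the last n items of iterable (hypothetical helper)."""
    # A deque with maxlen=n silently discards the oldest item on overflow.
    return list(deque(iterable, maxlen=n))

window = deque(maxlen=3)
for value in (1, 2, 3, 4):
    window.append(value)   # appending 4 pushes 1 out on the left
window.appendleft(5)       # appending on the left pushes 4 out on the right

print(list(window))
print(last_n(range(10), 3))
```

Because the deque drops old items by itself, `search` never has to trim its history manually.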
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    import heapq

    nums=[1,2,3,-10,100,30,200,21,9,7]
    print(heapq.nlargest(3,nums))
    print(heapq.nsmallest(3,nums))

    portfolio=[
        { 'name':'IBM','shares':100,'price':91.1},
        { 'name':'AAPL','shares':50,'price':532.1},
        { 'name':'FB','shares':200,'price':21.01},
        { 'name':'HPQ','shares':35,'price':32.75},
        { 'name':'YHOO','shares':45,'price':16.35},
        { 'name':'ACME','shares':75,'price':115.65}
    ]
    cheap=heapq.nsmallest(3,portfolio,key=lambda x:x['price'])
    print(cheap)

    '''
    If you want the N smallest or largest items of a collection and N is small
    compared with the size of the collection, these functions perform well,
    because under the hood the data is first arranged as a heap stored in a list.
    '''
    heapq.heapify(nums)
    print(nums)

    '''
    The most important property of a heap is that heap[0] is always the smallest
    item. The remaining items are easy to retrieve with heapq.heappop(), which
    pops the first item and replaces it with the next smallest one (this costs
    only O(log N), where N is the size of the heap). For example, to find the
    three smallest items:
    '''
    print(heapq.heappop(nums))
    print(heapq.heappop(nums))
    print(heapq.heappop(nums))

    '''
    nlargest(), nsmallest(): use when the number of items wanted is relatively small
    min(), max(): use when you want only the single smallest or largest item
    sorted(items)[n:m]: use when the number of items wanted approaches len(items)
    '''
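The heap property described above can be verified directly. The sketch below reuses the `nums` data from the example; the variable names are illustrative, and the list is copied so the original stays untouched.

```python
import heapq

nums = [1, 2, 3, -10, 100, 30, 200, 21, 9, 7]

heap = list(nums)      # copy so the original list is left untouched
heapq.heapify(heap)    # in-place rearrangement into a heap
smallest_first = heap[0]

# Popping three times yields the three smallest values in ascending order.
three_smallest = [heapq.heappop(heap) for _ in range(3)]

# When N approaches len(nums), sorting and slicing is the simpler choice.
same_result = sorted(nums)[:3]

print(smallest_first, three_smallest, same_result)
```

Note that `heapify` only guarantees `heap[0]` is the minimum; the rest of the list is not fully sorted.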
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    import heapq

    class PriorityQueue:
        def __init__(self):
            self._queue=[]
            self._index=0
        def push(self,item,priority):
            # negate priority so the highest one pops first;
            # _index breaks ties in insertion order
            heapq.heappush(self._queue,(-priority,self._index,item))
            self._index+=1
        def pop(self):
            return heapq.heappop(self._queue)[-1]

    class Item:
        def __init__(self,name):
            self.name=name
        def __str__(self):
            return self.name
        def __repr__(self):
            # return self.name
            return 'Item({!r})'.format(self.name)

    q=PriorityQueue()
    q.push(Item('town head'),1)
    q.push(Item('governor'),4)
    q.push(Item('president'),5)
    q.push(Item('mayor'),3)
    q.push(Item('county head'),2)
    print(q._queue)
    print(q.pop())
    print(q.pop())
    print(q.pop())
    print(q.pop())
    print(q.pop())
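The index stored in each heap entry is what makes equal priorities safe. The following sketch shows why; `Task` is a hypothetical payload class standing in for `Item`, and like `Item` it defines no ordering.

```python
import heapq

class Task:
    """Hypothetical payload type that defines no ordering."""
    def __init__(self, name):
        self.name = name

# Without an index, equal priorities force a comparison of the payloads.
heap = []
heapq.heappush(heap, (-1, Task('a')))
try:
    heapq.heappush(heap, (-1, Task('b')))
    tie_error = None
except TypeError as exc:
    tie_error = exc

# With an index, ties are resolved by insertion order
# and the payloads are never compared.
heap = []
for index, name in enumerate(['first', 'second', 'third']):
    heapq.heappush(heap, (-1, index, Task(name)))
order = [heapq.heappop(heap)[-1].name for _ in range(3)]

print(type(tie_error).__name__, order)
```

Tuples compare element by element, so the unique index settles every tie before Python ever looks at the payload.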
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    '''
    A dict maps each key to a single value. To map a key to several values,
    store those values in another container such as a list or a set.
    For example, you can build such dicts like this:
    '''
    people={
        'name':['alex','李杰'],
        'hobby':['play','coding']
    }
    project={
        'company':{ 'IBM':'CTO','Lenovo':'CEO','baidu':'COO','Alibaba':'UFO'},
        'applicant':['小白','lhf','武藤兰']
    }
    '''
    Whether to use a list or a set depends on your needs. Use a list to preserve
    insertion order; use a set to eliminate duplicates (when order does not matter).
    The defaultdict class in the collections module makes such dicts easy to build.
    A feature of defaultdict is that it automatically initialises the value for
    each key on first access, so you only need to focus on adding items. For example:
    '''
    from collections import defaultdict

    d=defaultdict(list)
    d['teacher'].append('alex')
    d['teacher'].append('wupeiqi')
    d['teacher'].append('wuqiqi')
    d['boss'].append('oldboy')
    d['boss'].append('alex')

    d_set=defaultdict(set)
    d_set['a'].add(1)
    d_set['a'].add(2)
    d_set['a'].add(3)
    print(d,d_set)

    # setdefault on a plain dict
    d={}
    d.setdefault('a',[]).append(1)
    d.setdefault('a',[]).append(2)
    d.setdefault('a',[]).append(2)
    print(d)

    # Building a multi-value dict by hand
    l=[
        ('teacher','alex'),
        ('teacher','lhf'),
        ('teacher','papa'),
        ('boss','alex'),
        ('boss','wupeiqi'),
    ]
    d={}
    for k,v in l:
        if k not in d:
            d[k]=[]
        d[k].append(v)
    print(d)

    # The same with defaultdict, which is more elegant
    d=defaultdict(list)
    for k,v in l:
        d[k].append(v)
    print(d)
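    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    from collections import OrderedDict

    d=OrderedDict()
    d['dream1']='first, make a hundred million'
    d['dream2']='then travel the world'
    d['dream3']='then marry seven or eight wives'
    d['dream4']='wash my face, and then the dream ends'
    for key in d:
        print(key,d[key])
    # import json
    # print(json.dumps(d))

    '''
    An OrderedDict internally maintains a doubly linked list that orders the keys
    by insertion order. Each newly inserted item is placed at the end of the list.
    Reassigning an existing key does not change its position.
    Note that an OrderedDict is roughly twice the size of a regular dict, because
    it maintains the extra linked list. So if you are building a data structure
    that needs a large number of OrderedDict instances, weigh the benefits of
    OrderedDict against the extra memory cost.
    '''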
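The ordering rules stated above are easy to confirm. The sketch below uses placeholder values; it also relies on two facts not mentioned in the original: `move_to_end` is an `OrderedDict` method, and since Python 3.7 a plain `dict` preserves insertion order as well.

```python
from collections import OrderedDict

d = OrderedDict()
d['dream1'] = 'a'
d['dream2'] = 'b'
d['dream3'] = 'c'

d['dream1'] = 'updated'          # an existing key keeps its position
order_after_update = list(d)

d.move_to_end('dream1')          # explicit reordering is an OrderedDict extra
order_after_move = list(d)

# Since Python 3.7 a plain dict also preserves insertion order.
plain = {'dream1': 'a', 'dream2': 'b'}
plain['dream1'] = 'updated'
plain_order = list(plain)

print(order_after_update, order_after_move, plain_order)
```

On modern Python, `OrderedDict` is mainly worth its memory cost when you need reordering operations or order-sensitive equality.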
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    prices={
        'ACME':45.23,
        'AAPL':612.78,
        'IBM':205.55,
        'HPQ':37.20,
        'FB':10.75
    }
    # zip() creates an iterator that can be consumed only once,
    # so the max() call below would raise an error
    # prices_and_names=zip(prices.values(),prices.keys())
    # min_price=min(prices_and_names)
    # max_price=max(prices_and_names)

    # A plain min(prices) compares the keys, not the values
    min_price=min(zip(prices.values(),prices.keys()))
    max_price=max(zip(prices.values(),prices.keys()))
    print(min_price)
    print(max_price)
    print(min(prices,key=lambda k:prices[k]))
    print(max(prices,key=lambda k:prices[k]))

    '''
    Note that the computation uses (value, key) pairs. When several entries share
    the same value, the key decides the result. For example, if the minimum or
    maximum value is duplicated, min() and max() return the entry with the
    smallest or largest key.
    '''
    prices={ 'A':45.23,'Z':45.23}
    print(min(zip(prices.values(),prices.keys())))
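The one-shot nature of `zip()` mentioned in the comments can be demonstrated as follows. This is a small sketch reusing the `prices` data; the variable names are illustrative.

```python
prices = {'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55, 'HPQ': 37.20, 'FB': 10.75}

pairs = zip(prices.values(), prices.keys())
lowest = min(pairs)          # consumes the iterator completely
try:
    max(pairs)               # nothing is left to iterate over
    second_pass_error = None
except ValueError as exc:
    second_pass_error = exc

# Materialising the pairs once makes them reusable.
pair_list = list(zip(prices.values(), prices.keys()))
lowest_again, highest = min(pair_list), max(pair_list)

print(lowest, highest, type(second_pass_error).__name__)
```

Building a list costs memory proportional to the dict size, so calling `zip()` twice, as the original does, is the lighter option for large dicts.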
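    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    a={
        'x':1,
        'y':2,
        'z':3,
    }
    b={
        'w':1,
        'x':2,
        'y':3,
    }
    print(a.keys() & b.keys())    # keys present in both
    print(a.keys() - b.keys())    # keys in a but not in b
    print(a.items() - b.items())  # (key, value) pairs in a but not in b

    # Build a new dict with some keys removed
    c={key:a[key] for key in a.keys() - { 'z','w'}}
    print(c)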
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    a=[1,5,2,1,9,1,5,10]
    # For simple de-duplication you can use a set, but a set is unordered
    # print(set(a))

    # If every value in the sequence is hashable, a set plus a generator
    # removes duplicates while keeping the original order
    def dedupe(items):
        seen=set()
        for i in items:
            if i not in seen:
                yield i
                seen.add(i)
    print(list(dedupe(a)))

    # If the elements are not hashable, derive a hashable key from each one
    a=[
        { 'name':'alex','age':18},
        { 'name':'alex','age':100},
        { 'name':'alex','age':100},
        { 'name':'lhf','age':18},
    ]
    def dedupe(items,key=None):
        seen=set()
        for i in items:
            k=i if key is None else key(i)
            if k not in seen:
                yield i
                seen.add(k)
    print(list(dedupe(a,key=lambda k:(k['name'],k['age']))))
    # To remove duplicate lines from a file, the first version is enough
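    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    # Ordinary slicing: a pile of hard-coded indices
    record='苍井空'
    record1='abcde'
    print(record[1:3])
    print(record1[1:3])

    # Named slices reduce the hard-coding
    record='苍井空 18621452550 沙河汇德商厦'
    phone=slice(4,15)
    addr=slice(16,22)  # the address occupies indices 16..21, so the stop is 22
    print(record[phone])
    print(record[addr])

    '''
    In general, code littered with hard-coded index values is much harder to read
    and maintain. If you come back to code you wrote a year ago, you will scratch
    your head wondering what you were trying to do. The solution here is a simple
    way to state more clearly what the code does. The built-in slice() creates a
    slice object that can be used anywhere a slice is allowed.
    '''
    s=slice(5,50,2)
    print(s.start)
    print(s.stop)
    print(s.step)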
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    from collections import Counter

    words=['if', 'you', 'were', 'a', 'you,I', 'would', 'never','if', 'would', 'if']
    # d={}
    # for word in words:
    #     if word not in d:
    #         d[word]=1
    #     else:
    #         d[word]+=1
    # print(d)
    word_counts=Counter(words)
    # print(word_counts)

    # The 3 most frequent words
    print(word_counts.most_common(3))
    # Values can be read just like a dict
    print(word_counts['if'])

    # Add more words
    more_words=['if','if']
    for word in more_words:
        word_counts[word]+=1
    print(word_counts)
    # Or use update() directly
    word_counts.update(more_words)
    print(word_counts)

    # Counter instances support arithmetic
    a=Counter(words)
    b=Counter(more_words)
    print(a)
    print(b)
    print(a-b)
    print(b-a)
    print(a+b)
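One detail of `Counter` arithmetic is worth seeing in isolation: the `-` operator discards counts that fall to zero or below, whereas the `subtract()` method keeps them. A short sketch with made-up word lists:

```python
from collections import Counter

a = Counter(['if', 'if', 'if', 'you'])
b = Counter(['if', 'if', 'were'])

difference = a - b        # keeps only positive counts
total = a + b

signed = Counter(a)       # copy, then subtract in place
signed.subtract(b)        # subtract() keeps zero and negative counts

print(difference, total, signed)
```

This explains why `b-a` in the example above prints an empty `Counter`: every count in `b` is cancelled out by `a`.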
Contents of conf.txt (the original configuration file):

    global
            log 127.0.0.1 local2
            daemon
            maxconn 256
            log 127.0.0.1 local2 info
    defaults
            log global
            mode http
            timeout connect 5000ms
            timeout client 50000ms
            timeout server 50000ms
            option dontlognull
    listen stats :8888
            stats enable
            stats uri /admin
            stats auth admin:1234
    frontend oldboy.org
            bind 0.0.0.0:80
            option httplog
            option httpclose
            option forwardfor
            log global
            acl www hdr_reg(host) -i www.oldboy.org
            use_backend www.oldboy.org if www
    backend www.oldboy.org
            server 100.1.7.9 100.1.7.9 weight 20 maxconn 3000

Contents of deque_test.py:

    import re

    def conf_dic(f):
        dic={}
        for line in f:
            if re.match('[a-zA-Z]',line):
                # a line starting with a letter opens a new section
                key=line
            elif re.match('^ ',line):
                # an indented line belongs to the current section
                dic.setdefault(key,[]).append(line)
        return dic

    if __name__ == '__main__':
        with open('conf.txt',encoding='utf-8') as f:
            dic=conf_dic(f)
        for i in dic:
            print('%s' %i,end='')
            for line in dic[i]:
                print(line,end='')
            print('-'*20)
    #_*_coding:utf-8_*_
    __author__ = 'Alex Li'
    from operator import itemgetter

    rows=[
        { 'fname':'Brian1','lname':'Jones1','uid':1003},
        { 'fname':'Brian2','lname':'Jones2','uid':1002},
        { 'fname':'Brian3','lname':'Jones3','uid':1001},
        { 'fname':'Brian4','lname':'Jones4','uid':1004},
    ]
    # rows_by_uid=sorted(rows,key=lambda rows:rows['uid'])
    # for i in rows_by_uid:
    #     print(i)
    #
    # rows_by_uid=sorted(rows,key=itemgetter('uid'))
    # for i in rows_by_uid:
    #     print(i)
    rows_by_lfname=sorted(rows,key=itemgetter('lname','fname'))
    print(rows_by_lfname)
    for i in rows_by_lfname:
        print(i)
    #_*_coding:utf-8_*_
    __author__ = 'Alex Li'
    # Building a list with a loop
    l=[]
    for i in range(6):
        i*=2
        l.append(i)
    print(l)

    # The same with a list comprehension
    y=[i*2 for i in range(6)]
    print(y)

    def func(n):
        return n+10
    z=[func(i) for i in range(6)]
    print(z)

    l=[1,2,3,-1,-10,4,5]
    # Filter out the negative numbers
    l_new=[i for i in l if i >= 0]
    print(l_new)
    #_*_coding:utf-8_*_
    __author__ = 'Alex Li'
    # A generator produces one value per request. You can only move forward with
    # next(), never back, because only the current value exists at any time.
    l=[i for i in range(10000000)]  # may freeze the machine: the whole list is built in memory
    g=(i for i in range(1000000))   # created instantly: values are computed lazily
    g.__next__()
    # Calling next past the last value raises StopIteration

    def fib(n1,n2,count=0):
        if count > 10:return
        if count == 0:
            print('',n1,end='')
        x=n2
        n2=(n1+n2)
        n1=x
        print(' ',n2,end='')
        count+=1
        fib(n1,n2,count)
    #
    # Fibonacci sequence for reference: 0 1 1 2 3 5 8 13 21
    # fib(0,1)
    #

    def fib2(max=10):
        n,a,b=0,0,1
        while n < max:
            # x=b
            # b=b+a
            # a=x
            yield b
            a,b=b,a+b
            # print(' ',b,end='')
            n+=1
        return 'done'

    x=fib2()
    # fib2() yields only max values; one more call raises StopIteration,
    # so the calls are wrapped in try/except
    while True:
        try:
            print(x.__next__())
        except StopIteration as e:
            print(e)  # prints the return value, 'done'
            break
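What happens to the generator's `return 'done'` is easier to see in a smaller example. In this sketch `fib_gen` is a hypothetical rewrite of `fib2` that uses the built-in `next()` instead of calling `__next__()` directly.

```python
def fib_gen(limit=10):
    """Yield the first `limit` Fibonacci numbers (hypothetical rewrite of fib2)."""
    a, b = 0, 1
    for _ in range(limit):
        yield b
        a, b = b, a + b
    return 'done'

gen = fib_gen(3)
values = [next(gen), next(gen), next(gen)]
try:
    next(gen)
    final = None
except StopIteration as stop:
    final = stop.value        # the generator's return value

# A for loop (or list()) handles StopIteration itself and discards that value.
first_ten = list(fib_gen())

print(values, final, first_ten)
```

In everyday code a `for` loop is usually the right way to drain a generator; catching `StopIteration` by hand is only needed when the return value matters.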
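    #_*_coding:utf-8_*_
    __author__ = 'Alex Li'
    import time

    def consumer(name):
        print('[%s] is ready to eat baozi' %name)
        while True:
            baozi=yield  # pauses here until a value is sent in
            print('baozi [%s] arrived and was eaten by [%s]' %(baozi,name))

    # c=consumer('alex')
    # c.__next__()
    # c.send('chive filling')

    def producer():
        c1=consumer('alex')
        c2=consumer('wupeiqi')
        # advance each consumer to its first yield before sending values
        c1.__next__()
        c2.__next__()
        print('start making baozi')
        for i in range(10):
            time.sleep(1)
            c1.send(i)
            c2.send(i)

    producer()

    # File objects are iterators too (this needs an existing a.txt)
    f=open('a.txt')
    f.__next__()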
    #_*_coding:utf-8_*_
    __author__ = 'Alex Li'
    name='lhf'
    passwd='123'

    def auth(auth_type):
        def inner_auth(func):
            def _wrapper(*args,**kwargs):
                username=input('username: ')
                password=input('passwd: ')
                if auth_type == 'local':
                    if username == name and password ==passwd:
                        print('user login successful')
                        res=func(*args,**kwargs)
                        return res  # hand the wrapped function's result back to the caller
                    else:
                        exit('login failed')
                elif auth_type == 'ldap':
                    print('ldap authentication is not implemented')
            return _wrapper
        return inner_auth

    def index():
        print('welcome to index page')

    @auth(auth_type='local')
    def home():
        print("welcome to home page")

    @auth(auth_type='ldap')
    def bbs():
        print('welcome to bbs page')

    index()
    home()
    bbs()
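The three nested levels of `auth` follow the usual pattern for a decorator that takes arguments. The sketch below shows the same shape without interactive input; `require_role` and the `user` dict are hypothetical names, and `functools.wraps` is an addition that the original does not use.

```python
import functools

def require_role(role):
    """Hypothetical parameterised decorator with three nested levels, like auth()."""
    def decorator(func):
        @functools.wraps(func)            # keep func.__name__ and __doc__
        def wrapper(user, *args, **kwargs):
            if user.get('role') != role:
                raise PermissionError('%s requires role %r' % (func.__name__, role))
            return func(user, *args, **kwargs)   # pass the result back to the caller
        return wrapper
    return decorator

@require_role('admin')
def home(user):
    """Return the greeting for the home page."""
    return 'welcome to home page, %s' % user['name']

print(home({'name': 'lhf', 'role': 'admin'}))
print(home.__name__)
```

The outer function receives the decorator's argument, the middle one receives the function being decorated, and the inner one runs on every call.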
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    prices={
        'ACME':45.23,
        'AAPL':612.78,
        'IBM':205.55,
        'HPQ':37.20,
        'FB':10.75
    }
    prices_new={key:val for key,val in prices.items() if val > 200}
    print(prices_new)
    #_*_coding:utf-8_*_
    __author__ = 'Linhaifeng'
    from collections import namedtuple

    # item, quantity, unit price
    records=[
        ('mac',2,20000),
        ('lenovo',1,3000),
        ('apple',0,10),
        ('tesla',10,1000000)
    ]
    '''
    A major use of named tuples is to free your code from index-based access.
    If a database call returns a large list of tuples and you access the elements
    by index, your code may break when a new column is added to the table.
    With named tuples you do not have that worry.
    To illustrate, here is the code using plain tuples:
    '''
    cost=0.0
    for rec in records:
        cost+=rec[1]*rec[2]
        print('item: %s, quantity: %s, running total: %s' %(rec[0],rec[1],cost))

    # With a named tuple
    sk=namedtuple('Stock',['name','count','price'])
    for rec in records:
        s=sk(*rec)
        print(s.count*s.price)

    p=namedtuple('People',['name','gender','age'])
    l=['alex','female',18]
    p1=p(*l)
    print(p1)
    print(p1.name)
    print(p1.age)
    print(p1.gender)

    '''
    Another use of named tuples is as a replacement for dicts, which need more
    memory. If you need to build a very large data structure containing dicts,
    named tuples are more efficient. Note, however, that unlike a dict,
    a named tuple is immutable. For example:
    '''
    p=namedtuple('People',['name','gender','age'])
    l=['alex','female',18]
    p1=p(*l)
    print(p1.name)
    # p1.name='sb'  # raises AttributeError: fields cannot be modified
    p1=p1._replace(name='sb')  # _replace returns a new tuple, so rebind p1
    print(p1.name)

    # A helper function built on a prototype tuple hides the _replace call
    p=namedtuple('People',['name','gender','age'])
    p1=p('','',None)
    def dict_to_stock(s):
        return p1._replace(**s)
    print(dict_to_stock({ 'name':'alex','gender':'f','age':18}))
    print(dict_to_stock({ 'name':'sb','gender':'f','age':18}))