Python notes

Notes on things I did not know, or found useful, while studying!

Dictionary

A dictionary is a data structure made of key-value pairs. It is declared and filled as follows.

a = dict()
a['market'] = {'fruit':"apple", 'drink':'beer'}

a.get('market').get('fruit')
# result: "apple"
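A chained get call like the one above fails with AttributeError when the outer key is missing, because get returns None by default. Here is a minimal sketch of one way around that, passing an empty dict as the default. The 'bakery' and 'bread' keys are made up for illustration.

```python
a = dict()
a['market'] = {'fruit': "apple", 'drink': 'beer'}

# get() returns None for a missing key unless a default is given
missing = a.get('bakery')

# passing {} as the default keeps the chained call safe
fruit = a.get('market', {}).get('fruit')
bread = a.get('bakery', {}).get('bread', 'none')

print(missing, fruit, bread)
```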

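update: a method that merges another dictionary into an existing one. The dictionary is modified in place, and values for keys that already exist are overwritten.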

d = {1: "one", 2: "three"}
d1 = {2: "two"}

# updates the value of key 2
d.update(d1)
print(d)

d1 = {3: "three"}

# adds element with key 3
d.update(d1)
print(d)

# {1: 'one', 2: 'two'}
# {1: 'one', 2: 'two', 3: 'three'}


d = {'x': 2}

d.update(y = 3, z = 0)
print(d)
# {'x': 2, 'y': 3, 'z': 0}
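Since update changes the dictionary it is called on, it may help to see it next to a merge that leaves the originals alone. This is a small sketch that reuses the values from the example above; the names merged and returned are only for illustration.

```python
d = {1: "one", 2: "three"}
d1 = {2: "two", 3: "three"}

# build a new dict; on duplicate keys the right-hand dict wins
merged = {**d, **d1}

# update() changes d itself and returns None
returned = d.update(d1)

print(merged)    # the merged copy
print(returned)  # None
print(d)         # d now has the same content as merged
```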

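setdefault(): returns the value of a key that already exists in the dictionary. If the key is missing, it inserts a new key-value pair (with the given default, or None) and returns that value.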

Dictionary1 = { 'A': 'Geeks', 'B': 'For', 'C': 'Geeks'} 
  
# using setdefault() method 
Third_value = Dictionary1.setdefault('C') 
print("Dictionary:", Dictionary1) 
print("Third_value:", Third_value) 

# result
# Dictionary: {'A': 'Geeks', 'B': 'For', 'C': 'Geeks'}
# Third_value: Geeks


Fourth_value = Dictionary1.setdefault('D', 'Geeks') 
print("Dictionary:", Dictionary1) 
print("Fourth_value:", Fourth_value) 

# result
# Dictionary: {'A': 'Geeks', 'B': 'For', 'C': 'Geeks', 'D': 'Geeks'}
# Fourth_value: Geeks
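A common use of setdefault is grouping items under a key without first checking whether the key exists. The sketch below groups words by their first letter; the word list is made up for illustration.

```python
words = ['apple', 'avocado', 'beer', 'banana', 'cider']

groups = {}
for word in words:
    # create an empty list the first time a letter is seen, then append
    groups.setdefault(word[0], []).append(word)

print(groups)
```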

List

count: list.count(x) : counts how many times the element x appears in the list.
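A short sketch of count on a made-up list. collections.Counter from the standard library is shown as an alternative for counting every element at once.

```python
from collections import Counter

fruits = ['apple', 'beer', 'apple', 'cider', 'apple']

n_apple = fruits.count('apple')   # appears three times
n_wine = fruits.count('wine')     # absent element gives 0, not an error

# Counter tallies every element in a single pass
counts = Counter(fruits)

print(n_apple, n_wine, counts['beer'])
```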

pandas DataFrame

set_index('column_name') : turns one column of the DataFrame into the index

df.set_index('month')


##################
       year  sale
month
1      2012    55
4      2014    40
7      2013    84
10     2014    31
##################
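The table above can be reproduced with a small DataFrame. This sketch assumes the data was originally held in three ordinary columns (month, year, sale), which the note does not show.

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})

# set_index returns a new DataFrame; df itself is unchanged
indexed = df.set_index('month')

# rows can now be looked up by their month label
sale_in_july = indexed.loc[7, 'sale']

print(indexed)
print(sale_in_july)
```

To change df itself, assign the result back (df = df.set_index('month')).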

Removing rows that contain NA

DataFrame.dropna()

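Options that can be passed to dropna()

subset=['name', 'born'] # only look at these columns when checking for NA
how='all' # drop a row only if every column is NA
thresh=2 # keep only rows that have at least 2 non-NA values
inplace=True # modify the same DataFrame instead of returning a new one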
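The options are easier to compare on one small DataFrame. This is a sketch with made-up data (the toy column is hypothetical), and each option is applied separately.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['A', 'B', np.nan, np.nan],
                   'toy': [np.nan, 'ball', 'car', np.nan],
                   'born': [np.nan, 2001.0, 2002.0, np.nan]})

any_na = df.dropna()                             # drop rows with any NA
all_na = df.dropna(how='all')                    # drop rows where every column is NA
by_subset = df.dropna(subset=['name', 'born'])   # check only these columns
at_least_2 = df.dropna(thresh=2)                 # keep rows with >= 2 non-NA values

print(list(any_na.index), list(all_na.index),
      list(by_subset.index), list(at_least_2.index))
```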

Extracting specific rows with a condition.

- Returns, as a DataFrame, the rows where "column1" equals "apple".

dataframe.loc[dataframe["column1"] == "apple"]

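Removing duplicates: DF.drop_duplicates(['column_name'], keep='first' or 'last' or False) drops duplicated rows; DF.duplicated(['column_name'], keep=...) takes the same arguments and returns a boolean mask that marks them.

Resetting the index

- Renumbers the rows 0, 1, 2, ... in their current order; drop=True discards the old index instead of keeping it as a column.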

df = df.reset_index(drop=True)
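The three steps above (filtering, removing duplicates, resetting the index) often appear together. Here is a minimal sketch on made-up data; the price column is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'column1': ['apple', 'beer', 'apple', 'apple'],
                   'price': [3, 5, 3, 4]})

# keep only the rows where column1 is "apple" (original index 0, 2, 3)
apples = df.loc[df['column1'] == 'apple']

# mark, then drop, rows that repeat an earlier (column1, price) pair
dup_mask = apples.duplicated(['column1', 'price'], keep='first')
unique = apples.drop_duplicates(['column1', 'price'], keep='first')

# renumber the remaining rows 0, 1, ...
renumbered = unique.reset_index(drop=True)

print(renumbered)
```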

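Merging data 1. Use this when data of the same form only needs to be appended along rows or columns. The column (or index) labels should match; labels present on only one side are filled with NaN.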

pd.concat( dataframes, axis=0)
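A small sketch of pd.concat along both axes. The two DataFrames are made up, and b deliberately has a column that a lacks, to show what happens when the forms do not match.

```python
import pandas as pd

a = pd.DataFrame({'x': [1, 2], 'y': [3, 4]})
b = pd.DataFrame({'x': [5], 'z': [6]})

# axis=0 stacks rows; ignore_index=True renumbers them 0, 1, 2
rows = pd.concat([a, b], axis=0, ignore_index=True)

# axis=1 places the frames side by side, aligned on the index
cols = pd.concat([a, a], axis=1)

print(rows)
print(cols)
```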

Merging data 2. For a JOIN there are several variants, such as inner, left, right, and outer.
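The note does not say which function it has in mind for JOINs. Assuming pd.merge and the four common join types, a sketch on two made-up tables looks like this:

```python
import pandas as pd

left = pd.DataFrame({'key': ['a', 'b', 'c'], 'l': [1, 2, 3]})
right = pd.DataFrame({'key': ['b', 'c', 'd'], 'r': [4, 5, 6]})

inner = pd.merge(left, right, on='key', how='inner')       # keys in both tables
left_join = pd.merge(left, right, on='key', how='left')    # every key from left
right_join = pd.merge(left, right, on='key', how='right')  # every key from right
outer = pd.merge(left, right, on='key', how='outer')       # keys from either table

print(outer)
```

Rows without a partner on the other side get NaN in the columns that come from that side.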

Functions

Sorting
- sorted(iterable, /, *, key=None, reverse=False)

sorted([4,5,2,3])
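## result: [2, 3, 4, 5]. list.sort() produces the same order, but it sorts the list in place and returns None.

Sorting a dictionary needs extra care about both the sort key and the shape of the output: sorted() on a dict iterates over its keys and returns a list of keys.

sorted({1: 'D', 2: 'B', 3: 'B', 4: 'E', 5: 'A'})
## result: [1, 2, 3, 4, 5]

Of the two snippets below, the first sorts by value and the second sorts by key. Both return a list of (key, value) tuples.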

import operator
x = {'1': 2, '3': 4, '4': 3, '2': 1, '0': 0}
sorted_x = sorted(x.items(), key=operator.itemgetter(1))

## result: [('0', 0), ('2', 1), ('1', 2), ('4', 3), ('3', 4)]

import operator
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
sorted_x = sorted(x.items(), key=operator.itemgetter(0))

## result: [(0, 0), (1, 2), (2, 1), (3, 4), (4, 3)]
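The same sort can be written with a lambda instead of operator.itemgetter, and the sorted pairs can be turned back into a dictionary. This is a sketch using the first example's data; it relies on dictionaries keeping insertion order (Python 3.7+).

```python
x = {'1': 2, '3': 4, '4': 3, '2': 1, '0': 0}

# lambda equivalent of operator.itemgetter(1), largest value first
by_value_desc = sorted(x.items(), key=lambda kv: kv[1], reverse=True)

# dict() keeps the order in which the pairs are inserted
as_dict = dict(by_value_desc)

print(by_value_desc)
print(as_dict)
```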

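Sorting
- DataFrame.sort_values(by="column_name", inplace=False, ascending=True): inplace decides whether the sorted result is stored in the current DataFrame itself, and ascending chooses ascending or descending order.

Stripping leading/trailing whitespace and '\n': strip()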

"\nadsf  ".strip()
# result: 'adsf'
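strip has one-sided variants and also accepts a set of characters to remove. A short sketch with made-up strings:

```python
s = "\n  adsf  \t"

both = s.strip()     # whitespace removed from both ends
left = s.lstrip()    # only the left side
right = s.rstrip()   # only the right side

# with an argument, strip removes any of those characters from both ends
trimmed = "--adsf-=".strip("-=")

print(repr(both), repr(left), repr(right), repr(trimmed))
```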

Checking for NaN
- math.isnan(x) : returns True if the float x is NaN and False otherwise (requires import math)
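math.isnan is needed because NaN cannot be detected with ==. A minimal sketch, with made-up data:

```python
import math

value = float('nan')

# NaN never compares equal to anything, including itself
equal_to_itself = (value == value)
is_nan = math.isnan(value)

# filter NaN out of a list of floats
data = [1.0, float('nan'), 3.0]
clean = [v for v in data if not math.isnan(v)]

print(equal_to_itself, is_nan, clean)
```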

Dropping specific rows/columns
- dataframe.drop('index_label', axis=0): axis=0 means rows
- dataframe.drop('column_name', axis=1): axis=1 means columns

Dot product and norm
- np.linalg.norm([1,2,3]): the (Euclidean) norm of the vector [1,2,3]
- np.dot([1,2,3],[2,2,2]): the dot product of the two vectors
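To connect the two functions: the Euclidean norm is the square root of a vector's dot product with itself. The sketch below checks that on the vectors from the bullets above and, as an illustrative extra, combines both into a cosine similarity.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([2, 2, 2])

dot = np.dot(v, w)                     # 1*2 + 2*2 + 3*2
norm = np.linalg.norm(v)               # sqrt(1 + 4 + 9)
norm_by_hand = np.sqrt(np.dot(v, v))   # same value via the dot product

# cosine of the angle between v and w
cosine = dot / (np.linalg.norm(v) * np.linalg.norm(w))

print(dot, norm, cosine)
```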
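
Getting the date and time
- from datetime import datetime
- print(datetime.now()): prints the current time

A function for working with a time given as text: time.strptime. It splits the time into its fields (hour, minute, second, ...), which is handy for calculations.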

import time
time.strptime('20:28:11',"%H:%M:%S")
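As an example of the calculations this enables, the sketch below reads the fields of the parsed time and computes an elapsed time with datetime.strptime. The end time '21:00:00' is made up.

```python
import time
from datetime import datetime

t = time.strptime('20:28:11', "%H:%M:%S")

# the parsed fields are available as attributes
seconds = t.tm_hour * 3600 + t.tm_min * 60 + t.tm_sec   # seconds since midnight

# datetime.strptime gives objects that can be subtracted directly
start = datetime.strptime('20:28:11', "%H:%M:%S")
end = datetime.strptime('21:00:00', "%H:%M:%S")
elapsed = (end - start).total_seconds()

print(seconds, elapsed)
```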

Saving Python objects to an external file
- import shelve
- The first part below saves the data to the file 'result1.db' in the cwd; the second part reads 'result1.db' from the cwd back in.

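import shelve

# review is assumed to be an existing Python object that can be pickled
with shelve.open('result1.db') as f:
    f['obv1'] = review

# open the same file again and read the object back
with shelve.open('result1.db') as f:
    a = f['obv1']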
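A self-contained round trip may make the shelve pattern easier to try out. In this sketch the review object is placeholder data, and the shelf is written to a temporary directory rather than the cwd.

```python
import os
import shelve
import tempfile

review = {'title': 'example', 'score': 5}   # placeholder data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'result1')

    with shelve.open(path) as f:      # write
        f['obv1'] = review

    with shelve.open(path) as f:      # read back
        a = f['obv1']
        keys = list(f.keys())

print(a, keys)
```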

Saving a pandas DataFrame as csv
- df.to_csv(file path including the file name, na_rep='NaN')

Saving a pandas DataFrame as excel
- df.to_excel(file)

# list the files in a directory
import os
os.listdir()

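Writing a txt file: the modes are 'r' (read), 'a' (append), and 'w' (write, overwriting existing content).

# NEWS is assumed to be a list of lists of strings defined earlier
with open('entertain.txt', 'a', encoding='utf-8') as f: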
    for i in NEWS[5]:
        f.write(i)
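The difference between 'a' and 'w' shows up when the same file is opened twice. A minimal sketch, writing to a temporary directory so that nothing in the cwd is touched:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'entertain.txt')

    with open(path, 'w', encoding='utf-8') as f:   # create the file
        f.write('first\n')
    with open(path, 'a', encoding='utf-8') as f:   # append to the end
        f.write('second\n')
    with open(path, 'r', encoding='utf-8') as f:   # read
        after_append = f.read()

    with open(path, 'w', encoding='utf-8') as f:   # 'w' discards the old content
        f.write('third\n')
    with open(path, 'r', encoding='utf-8') as f:
        after_overwrite = f.read()

print(repr(after_append), repr(after_overwrite))
```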

Printing the path where a module is located: the example uses the os module.

import os
path = os.path.abspath(os.__file__)
print(path)

Saving and loading a pickle file: pickle

import pickle

# save (variable_name is the object to store)
with open('filename.pickle', 'wb') as f:
    pickle.dump(variable_name, f, pickle.HIGHEST_PROTOCOL)

# load
with open('filename.pickle', 'rb') as f:
    new_variable_name = pickle.load(f)
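A runnable round trip of the save/load pattern above, using placeholder data and a temporary directory. It also shows that load builds a new object with equal content rather than returning the original.

```python
import os
import pickle
import tempfile

data = {'name': 'apple', 'prices': [3, 4]}   # placeholder object

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.pickle')

    with open(path, 'wb') as f:
        pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

same_content = (restored == data)
same_object = (restored is data)

print(same_content, same_object)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.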

NumPy array product, matrix multiplication: np.matmul(a, b)

import numpy as np

a = np.array([[1, 0],
              [0, 1]])
b = np.array([1, 2])
np.matmul(a, b)
#  array([1, 2])
np.matmul(b, a)
#  array([1, 2])
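Because the example above uses the identity matrix, both orders give the same answer. This sketch uses a non-symmetric matrix (made up for illustration) to show that the order matters, and that * is element-wise rather than a matrix product.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([1, 2])

mat_ab = np.matmul(a, b)   # [1*1 + 2*2, 3*1 + 4*2]
mat_ba = np.matmul(b, a)   # [1*1 + 2*3, 1*2 + 2*4]
elementwise = a * b        # b is broadcast across each row of a

print(mat_ab, mat_ba)
print(elementwise)
```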
