Python Pandas(Panel Data)

기계는 거짓말하지 않는다

KillinTime 2021. 7. 1. 23:22

Pandas

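Pandas is a Python library used for data analysis and manipulation.

Pandas provides two main data structures: Series and DataFrame.

A Series resembles a time series (a sequence of data points placed at regular time intervals) and consists of an index and values. A DataFrame has a frame that looks like dictionary data arranged into matrix form.

With these data structures, time-series and non-time-series data can be handled in a unified way.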

Install

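Run pip install pandas in a command window (the pip package manager must be installed).

To use Pandas, import it with import pandas.

By convention the alias pd is used, so the import is written as import pandas as pd.

Series and DataFrame can also be used directly after from pandas import Series, DataFrame.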
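
As a quick check of the two import styles described above, the sketch below confirms that both refer to the same classes. It assumes pandas is already installed. The printed version depends on the local environment.

```python
import pandas as pd
from pandas import Series, DataFrame

# Both import styles point at the same class objects.
print(Series is pd.Series)
print(DataFrame is pd.DataFrame)

# The installed version; the value depends on the environment.
print(pd.__version__)
```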

Series

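A Series looks very similar to a dictionary, but it is not just a dictionary whose keys were renamed to an index. The underlying data structure itself is quite different.

It is a structure that assigns an index to each value of a one-dimensional array.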

import pandas as pd

fruit = pd.Series([2000, 3800, 1200, 6000], index=['apple', 'banana', 'pear', 'cherry'])
print(fruit)
print('-' * 20)
print(fruit.values) # check the values
print(fruit.index) # check the index

fruitData = {'apple':2500, 'banana':3800, 'pear':1200, 'cherry':6000}
fruit = pd.Series(fruitData)
print(type(fruitData), type(fruit))
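
To make the difference between a Series and a dictionary more concrete, here is a minimal sketch. The variable names (prices, discounted, expensive) are only illustrative and are not from the original post.

```python
import pandas as pd

prices = pd.Series({'apple': 2500, 'banana': 3800, 'cherry': 6000})

# A Series supports both label-based and position-based access.
by_label = prices.loc['banana']
by_position = prices.iloc[0]

# Arithmetic applies to every value at once, which a plain dict cannot do.
discounted = prices * 0.9

# A boolean mask selects a subset and keeps the index labels.
expensive = prices[prices > 3000]

print(by_label, by_position)
print(discounted)
print(expensive)
```

Element-wise arithmetic and boolean masks come from the array that backs the Series, which is why it behaves differently from a dict even though it can be built from one.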

Setting names

fruit.name = 'fruitPrice'
print(fruit)
fruit.index.name = 'fruitName'
print('-' * 20)
print(fruit)
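
One place where these names become visible is when a Series is turned into a table. The sketch below uses reset_index() for that. The variable table is an illustrative name, and this is just one possible use of the names.

```python
import pandas as pd

fruit = pd.Series({'apple': 2500, 'banana': 3800})
fruit.name = 'fruitPrice'
fruit.index.name = 'fruitName'

# reset_index() moves the index into a regular column.
# The index name and the Series name become the column headers.
table = fruit.reset_index()
print(table)
print(list(table.columns))
```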

DataFrame

A structure that assigns an index and columns to the rows and columns of a two-dimensional array. It has a table form.

fruitData = {'fruitName':['apple', 'banana', 'pear', 'cherry'],
            'fruitPrice':[2500, 3800, 1200, 6000],
            'num':[10, 5, 3, 8]}

fruitFrame = pd.DataFrame(fruitData, index=fruitData['fruitName'], columns=['fruitPrice', 'num'])
print(fruitFrame)
print('-' * 20)

vals = [[2500, 10], [3800, 5], [1200, 3], [6000, 8]]
index = ['apple', 'banana', 'pear', 'cherry']
cols = ['fruitPrice', 'num']

fruitFrame2 = pd.DataFrame(vals, index=index, columns=cols)
print(fruitFrame2)

print(fruitFrame.values)
print(fruitFrame.index)
print(fruitFrame.columns)
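
Since a DataFrame labels both rows and columns, values can be looked up by those labels. The following is a small sketch on a reduced version of the data. The names frame, price_column, apple_row and single_value are illustrative only.

```python
import pandas as pd

frame = pd.DataFrame(
    [[2500, 10], [3800, 5]],
    index=['apple', 'banana'],
    columns=['fruitPrice', 'num'],
)

# One column comes back as a Series indexed by the row labels.
price_column = frame['fruitPrice']

# One row comes back as a Series indexed by the column labels.
apple_row = frame.loc['apple']

# A row label plus a column label gives a single value.
single_value = frame.loc['banana', 'num']

print(type(price_column).__name__)
print(apple_row)
print(single_value)
```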

Deletion

fruitFrame2 = fruitFrame.drop(['apple', 'cherry'])
print(fruitFrame2)
print('-' * 20)

fruitFrame3 = fruitFrame.drop('num', axis=1) # axis=1 drops a column; the default axis=0 drops rows
print(fruitFrame3)
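
It may help to note that drop() returns a new object and leaves the original untouched unless the result is assigned back. The sketch below also uses the index= and columns= keywords as an alternative to axis. The names frame, without_rows and without_col are illustrative.

```python
import pandas as pd

frame = pd.DataFrame(
    {'fruitPrice': [2500, 3800, 6000], 'num': [10, 5, 8]},
    index=['apple', 'banana', 'cherry'],
)

# index= and columns= state explicitly which axis is meant.
without_rows = frame.drop(index=['apple', 'cherry'])
without_col = frame.drop(columns='num')

# The original frame still has all rows and columns.
print(frame.shape)
print(without_rows)
print(without_col)
```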

Adding a column

fruitData = {'fruitName':['apple', 'banana', 'pear', 'cherry'],
            'fruitPrice':[2500, 3800, 1200, 6000],
            'num':[10, 5, 3, 8]}
            
fruitFrame = pd.DataFrame(fruitData, columns=['fruitPrice', 'num', 'fruitName'])

# add a column
fruitFrame['Year'] = 2021
print(fruitFrame)
print('-' * 40)

variable = pd.Series([4, 2, 1], index = [0, 2, 3])
fruitFrame['stock'] = variable
print(fruitFrame)
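
When a Series is assigned as a column, the values are matched by index label, not by position. The sketch below isolates that behavior. Filling the gap with 0 at the end is an assumption about what one might want, not something stated in the original.

```python
import pandas as pd

frame = pd.DataFrame({'num': [10, 5, 3, 8]})  # default index 0..3

stock = pd.Series([4, 2, 1], index=[0, 2, 3])
frame['stock'] = stock  # aligned on index labels

# Label 1 is missing from `stock`, so that row gets NaN
# and the column is stored as floating point.
missing = frame['stock'].isna().tolist()
print(frame)
print(missing)

# Assumption for illustration: treat missing stock as 0.
frame['stock'] = frame['stock'].fillna(0).astype(int)
print(frame)
```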

Adding a row

fruitData = {'fruitName':['apple', 'banana', 'pear', 'cherry'],
            'fruitPrice':[2500, 3800, 1200, 6000],
            'num':[10, 5, 3, 8]}
            
fruitFrame = pd.DataFrame(fruitData, index=fruitData['fruitName'], columns=['num', 'fruitPrice'])
print(fruitFrame)
print('=' * 40)

tempData = pd.DataFrame([[7, 2500]], index=['grape'], columns=['num', 'fruitPrice'])
# DataFrame.append was removed in pandas 2.0; pd.concat returns a new DataFrame
fruitFrame = pd.concat([fruitFrame, tempData])
print(fruitFrame)
print('-' * 40)

tempData = {'num':15, 'fruitPrice':4500} # dict
# wrap the dict in a one-row DataFrame; ignore_index=True renumbers the index from 0
print(pd.concat([fruitFrame, pd.DataFrame([tempData])], ignore_index=True))
print('-' * 40)

fruitFrame.loc['melon'] = tempData # assign a new row by label with loc
print(fruitFrame)
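
When several rows need to be added, one option is to collect them first and concatenate once instead of growing the frame row by row. This is a sketch under that assumption. The new_rows data and the names extra and combined are hypothetical.

```python
import pandas as pd

frame = pd.DataFrame(
    {'num': [10, 5], 'fruitPrice': [2500, 3800]},
    index=['apple', 'banana'],
)

# Hypothetical new rows, keyed by the index label each should receive.
new_rows = {
    'grape': {'num': 7, 'fruitPrice': 2500},
    'melon': {'num': 15, 'fruitPrice': 4500},
}

# Build one DataFrame from all new rows, then concatenate a single time.
extra = pd.DataFrame.from_dict(new_rows, orient='index')
combined = pd.concat([frame, extra])

print(combined)
```

pd.concat matches columns by name, so the order of keys inside each row dict does not matter.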
