CSV files are comma-separated value files. To access the data in a CSV file, we use the read_csv() function from Pandas, which returns the data as a DataFrame.
Syntax of read_csv()
Here is the syntax of the Pandas read_csv() function with its parameters.
Syntax: pd.read_csv(filepath_or_buffer, sep=',', header='infer', index_col=None, usecols=None, engine=None, skiprows=None, nrows=None)
Parameters:
- filepath_or_buffer : location of the CSV file. It accepts any string file path or a URL.
- sep : the separator (delimiter) between fields; the default value is ','.
- header : accepts an int or a list of ints giving the row number(s) to use as the column names and marking the start of the data. If header=None is passed, no row is treated as a header and the columns are labeled 0, 1, and so on.
- usecols : reads only the selected columns from the CSV file.
- nrows : number of rows to read from the dataset.
- index_col : column(s) to use as the row labels. If None, a default integer index is used.
- skiprows : skips the given rows when building the new DataFrame.
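To make the header and index_col descriptions concrete, here is a minimal sketch. It assumes a small made-up dataset held in memory (the names csv_text, df_default, df_no_header and df_indexed are illustrative, not part of the original article) and uses io.StringIO in place of a file on disk.

```python
import io
import pandas as pd

# Hypothetical in-memory data standing in for a CSV file on disk.
csv_text = "name,age\nAnna,31\nBen,27\n"

# Default header='infer': the first line supplies the column names.
df_default = pd.read_csv(io.StringIO(csv_text))
print(df_default.columns.tolist())

# header=None: every line is treated as data, columns are numbered 0, 1, ...
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(df_no_header.columns.tolist())

# index_col='name': the 'name' column becomes the row labels.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col="name")
print(df_indexed.index.tolist())
```

Note that with header=None the original header line becomes an ordinary data row, so the resulting frame has one more row than the default read.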
Read a CSV file using Pandas read_csv
Before using this function, we must import the Pandas library. We then load the CSV file with Pandas.
Python3
# Import pandas
import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv('people.csv')

# Show the first five rows
print(df.head())
Output:
   First Name  Last Name  Sex     Email              Date of birth  Job Title
0  Shelby      Terrell    Male    [email protected]  1945-10-26     Games developer
1  Phillip     Summers    Female  [email protected]  1910-03-24     Phytotherapist
2  Kristine    Travis     Male    [email protected]  1992-07-02     Homeopath
3  Yesenia     Martinez   Male    [email protected]  2017-08-03     Market researcher
4  Lori        Todd       Male    [email protected]  1938-12-01     Veterinary surgeon
Using sep in read_csv()
In this example, we take a CSV file whose fields are separated by a mix of special characters to see how the sep parameter works.
Python3
# sample.csv contains:
# totalbill_tip, sex:smoker, day_time, size
# 16.99, 1.01:Female|No, Sun, Dinner, 2
# 10.34, 1.66, Male, No|Sun:Dinner, 3
# 21.01:3.5_Male, No:Sun, Dinner, 3
# 23.68, 3.31, Male|No, Sun_Dinner, 2
# 24.59:3.61, Female_No, Sun, Dinner, 4
# 25.29, 4.71|Male, No:Sun, Dinner, 4

# Import the pandas library
import pandas as pd

# Load the CSV data; the Python engine handles the regex separator
df = pd.read_csv('sample.csv', sep='[:, |_]', engine='python')

# Print the DataFrame
print(df)
Output:
       totalbill  tip   Unnamed: 2  sex     smoker  Unnamed: 5  day  time    Unnamed: 8  size
16.99  NaN        1.01  Female      No      NaN     Sun         NaN  Dinner  NaN         2
10.34  NaN        1.66  NaN         Male    NaN     No          Sun  Dinner  NaN         3
21.01  3.50       Male  NaN         No      Sun     NaN         Dinner  NaN  3.0         None
23.68  NaN        3.31  NaN         Male    No      NaN         Sun  Dinner  NaN         2
24.59  3.61       NaN   Female      No      NaN     Sun         NaN  Dinner  NaN         2
25.29  NaN        4.71  Male        NaN     No      Sun         NaN  Dinner  NaN         4
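The Unnamed columns and NaN values in output like the one above arise because the character class [:, |_] matches one character at a time, so a comma followed by a space counts as two separators with an empty field between them. One possible variation, sketched here on a shortened in-memory copy of the sample (the variable names and the pattern are assumptions, not the article's code), lets the pattern consume optional whitespace around each delimiter.

```python
import io
import pandas as pd

# Hypothetical shortened sample mixing ',', ':', '|' and '_' as delimiters.
raw = (
    "totalbill_tip, sex:smoker, day_time, size\n"
    "16.99, 1.01:Female|No, Sun, Dinner, 2\n"
    "10.34, 1.66, Male, No|Sun:Dinner, 3\n"
)

# The pattern swallows whitespace around each delimiter,
# so ', ' is treated as one separator instead of two.
df = pd.read_csv(io.StringIO(raw), sep=r"\s*[:,|_]\s*", engine="python")
print(df)
```

With this pattern every line splits into the same seven fields, so no placeholder columns are created. It only works when the field values themselves contain none of the delimiter characters.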
Using usecols in read_csv()
Here we load only three columns, [First Name, Sex, Email], and pass header=0 so that the first row supplies the column names, which matches the default behavior.
Python3
import pandas as pd

# Read only the selected columns; row 0 supplies the column names
df = pd.read_csv('people.csv', header=0, usecols=['First Name', 'Sex', 'Email'])

# Print the first five rows of the DataFrame
print(df.head())
Output:
   First Name  Sex     Email
0  Shelby      Male    [email protected]
1  Phillip     Female  [email protected]
2  Kristine    Male    [email protected]
3  Yesenia     Male    [email protected]
4  Lori        Male    [email protected]
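Besides a list of names, usecols also accepts column positions or a callable that is evaluated against each column name. The sketch below illustrates both forms on made-up in-memory data; csv_text, by_position and by_rule are hypothetical names chosen for this example.

```python
import io
import pandas as pd

# Hypothetical data with the same kind of columns as people.csv.
csv_text = (
    "First Name,Last Name,Sex,Job Title\n"
    "Anna,Berg,Female,Analyst\n"
    "Ben,Cole,Male,Engineer\n"
)

# Select columns by 0-based position instead of by name.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])
print(by_position.columns.tolist())

# Select columns with a callable: keep those whose name ends in 'Name'.
by_rule = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: name.endswith("Name"),
)
print(by_rule.columns.tolist())
```

The callable form can be convenient when the exact column names are not known in advance but follow a pattern.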
Using index_col in read_csv()
Here we use Sex as the first index level and Job Title as the second. The index_col parameter lets us set the index columns directly while reading the file.
Python3
import pandas as pd

# Use 'Sex' and 'Job Title' as a two-level index; keep 'Email' as data
df = pd.read_csv('people.csv', header=0,
                 index_col=['Sex', 'Job Title'],
                 usecols=['Sex', 'Job Title', 'Email'])

print(df.head())
Output:
                            Email
Sex     Job Title
Male    Games developer     [email protected]
Female  Phytotherapist      [email protected]
Male    Homeopath           [email protected]
        Market researcher   [email protected]
        Veterinary surgeon  [email protected]
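Passing a list to index_col produces a two-level index, and rows can then be looked up by a (Sex, Job Title) pair. The following is a small sketch on invented data; the addresses under example.com and the variable names are placeholders, not values from people.csv.

```python
import io
import pandas as pd

# Hypothetical data; the e-mail addresses are placeholders.
csv_text = (
    "Sex,Job Title,Email\n"
    "Male,Analyst,a@example.com\n"
    "Female,Engineer,b@example.com\n"
    "Male,Engineer,c@example.com\n"
)

# Both columns become levels of the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col=["Sex", "Job Title"])
print(list(df.index.names))

# Look up a single row by its (Sex, Job Title) pair.
email = df.loc[("Male", "Engineer"), "Email"]
print(email)
```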
Using nrows in read_csv()
Here we read only 3 rows by using the nrows parameter.
Python3
import pandas as pd

# Read only the first 3 data rows
df = pd.read_csv('people.csv', header=0,
                 index_col=['Sex', 'Job Title'],
                 usecols=['Sex', 'Job Title', 'Email'],
                 nrows=3)

print(df)
Output:
                         Email
Sex     Job Title
Male    Games developer  [email protected]
Female  Phytotherapist   [email protected]
Male    Homeopath        [email protected]
Using skiprows in read_csv()
The skiprows parameter skips selected rows of the CSV file. In the output below, the rows listed in skiprows are missing from the original dataset.
Python3
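import pandas as pd

df = pd.read_csv('people.csv')
print('Previous Dataset: ')
print(df)

# Using skiprows: the list holds 0-based file line numbers, and line 0
# is the header, so lines 2 and 6 are the rows for Phillip and Erin
df = pd.read_csv('people.csv', skiprows=[2, 6])
print('Dataset After skipping rows: ')
print(df)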
Output:
Previous Dataset:
   First Name  Last Name  Sex     Email              Date of birth  Job Title
0  Shelby      Terrell    Male    [email protected]  1945-10-26     Games developer
1  Phillip     Summers    Female  [email protected]  1910-03-24     Phytotherapist
2  Kristine    Travis     Male    [email protected]  1992-07-02     Homeopath
3  Yesenia     Martinez   Male    [email protected]  2017-08-03     Market researcher
4  Lori        Todd       Male    [email protected]  1938-12-01     Veterinary surgeon
5  Erin        Day        Male    [email protected]  2015-10-28     Management officer
6  Katherine   Buck       Female  [email protected]  1989-01-22     Analyst
7  Ricardo     Hinton     Male    [email protected]  1924-03-26     Hydrogeologist
Dataset After skipping rows:
   First Name  Last Name  Sex     Email              Date of birth  Job Title
0  Shelby      Terrell    Male    [email protected]  1945-10-26     Games developer
1  Kristine    Travis     Male    [email protected]  1992-07-02     Homeopath
2  Yesenia     Martinez   Male    [email protected]  2017-08-03     Market researcher
3  Lori        Todd       Male    [email protected]  1938-12-01     Veterinary surgeon
4  Katherine   Buck       Female  [email protected]  1989-01-22     Analyst
5  Ricardo     Hinton     Male    [email protected]  1924-03-26     Hydrogeologist
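skiprows counts lines of the file starting at 0, so when the file has a header row, line 0 is the header and line 1 is the first data row. The sketch below, on made-up in-memory data with hypothetical variable names, shows the list form, the callable form, and a combination with nrows.

```python
import io
import pandas as pd

# Hypothetical data: line 0 is the header, lines 1-5 are data rows.
csv_text = "id,name\n1,Anna\n2,Ben\n3,Cara\n4,Dan\n5,Eve\n"

# List form: skip file lines 2 and 4 (the rows for Ben and Dan).
skip_list = pd.read_csv(io.StringIO(csv_text), skiprows=[2, 4])
print(skip_list["id"].tolist())

# Callable form: skip every odd-numbered file line (True means skip).
skip_rule = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: i % 2 == 1)
print(skip_rule["id"].tolist())

# Combined with nrows: skip the first data row, then read two rows.
first_two = pd.read_csv(io.StringIO(csv_text), skiprows=[1], nrows=2)
print(first_two["id"].tolist())
```

In each case the header on line 0 is kept because it is not among the skipped lines, and the resulting DataFrame gets a fresh index starting at 0.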