Pandas is a library for working with tabular data in Python.
Pandas stores data in dataframes.
NumPy is a Python library for working with datasets using NumPy arrays of varying dimensions.
NumPy arrays are similar to Python lists but faster and more powerful.
Output:
5
5
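As a minimal sketch of the two libraries described above, the following builds a small dataframe and a NumPy array. The column names and values are made up for illustration and are not the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical values, for illustration only.
ages = [45, 52, 61]

# A pandas dataframe stores tabular data as named columns.
patients = pd.DataFrame({"Patient_ID": [1, 2, 3], "Age": ages})

# A NumPy array supports element-wise arithmetic, which a plain list does not.
ages_array = np.array(ages)
ages_next_year = ages_array + 1  # adds 1 to every element

print(patients.shape)   # (rows, columns)
print(ages_next_year)
```

Writing ages + 1 with the plain list would raise a TypeError, which is one practical difference between lists and arrays.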
To update values, use the DataFrame.loc (location) method.
Use dataframe_name['column name'] to select a column.
To match rows, compare the column with == and the value.
In the second loc argument, specify the column to update.
Use iloc (integer location) to select rows and columns by position.
The syntax is dataframe_name.iloc[row_number, column_number].
Use the [] operator to filter rows.
Output:
       Patient_ID        Age  Cholesterol  Glucose Level
count    6.000000   6.000000     6.000000       6.000000
mean     3.333333  56.666667   208.333333     107.166667
std      1.632993   9.309493    20.412415      22.003788
min      1.000000  45.000000   180.000000      90.000000
25%      2.250000  51.250000   200.000000      95.750000
50%      3.500000  55.000000   205.000000      99.000000
75%      4.750000  62.500000   217.500000     107.500000
max      5.000000  70.000000   240.000000     150.000000
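The selection, update, and filtering steps listed earlier can be sketched as follows. The dataframe here is a hypothetical stand-in with made-up values, not the lesson's data, and the summary table above is assumed to be the kind of result that describe() produces.

```python
import pandas as pd

# Hypothetical data; names and values are assumptions, not the lesson's dataset.
df = pd.DataFrame({
    "Patient_ID": [1, 2, 3],
    "Age": [45, 52, 61],
    "Cholesterol": [180, 205, 240],
})

# Select one column by name.
ages = df["Age"]

# Update: the first loc argument picks the rows (Patient_ID == 2),
# the second names the column to change.
df.loc[df["Patient_ID"] == 2, "Cholesterol"] = 210

# Select by position: row 0, column 1 (the first patient's Age).
first_age = df.iloc[0, 1]

# Filter rows with the [] operator and a boolean condition.
over_50 = df[df["Age"] > 50]

# Summary statistics (count, mean, std, quartiles) for each numeric column.
summary = df.describe()
print(summary)
```

Note that iloc counts from zero, so df.iloc[0, 1] refers to the first row and the second column.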
pandas_split.py
patient_data_2['Diagnosis'] == 'Hypertension' finds all rows where the diagnosis is hypertension.
['Cholesterol'] returns only the cholesterol values for the rows that are filtered by the above query.
Use the shape attribute to find the shape of an array.
Output:
(2, 3)
Output:
[10 2 2 4 1 1 7]
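The filter-then-select pattern and the shape attribute described earlier can be sketched like this. The contents of patient_data_2 below are hypothetical placeholders, since the lesson's actual data is not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for patient_data_2; values are illustrative only.
patient_data_2 = pd.DataFrame({
    "Diagnosis": ["Hypertension", "Diabetes", "Hypertension"],
    "Cholesterol": [220, 190, 240],
})

# Boolean mask: True where the diagnosis is hypertension.
is_hypertensive = patient_data_2["Diagnosis"] == "Hypertension"

# Filter the rows, then keep only the Cholesterol column.
hypertension_cholesterol = patient_data_2[is_hypertensive]["Cholesterol"]
print(hypertension_cholesterol.tolist())

# shape reports the size of each dimension of an array.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.shape)  # (2, 3): 2 rows, 3 columns
```

The result of the filter is a pandas Series that keeps the original row labels, which is why tolist() is used to print just the values.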
Use stats.ttest_ind to run a t-test on two independent samples.
Output:
t-score: 2.5
p-value: 0.05
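A t-test of this kind might be run as in the sketch below, which uses scipy's stats.ttest_ind on two made-up groups. The group names and measurements are assumptions for illustration, so the printed numbers will not match the output above.

```python
from scipy import stats

# Hypothetical measurements for two independent groups.
group_a = [220, 240, 235, 228, 250]
group_b = [190, 205, 198, 210, 202]

# ttest_ind compares the means of two independent samples and
# returns the t statistic and the two-sided p-value.
t_score, p_value = stats.ttest_ind(group_a, group_b)

print(f"t-score: {t_score:.2f}")
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the difference between the group means is unlikely to be due to chance alone. By default ttest_ind assumes equal variances in both groups, and passing equal_var=False switches to Welch's t-test.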
Note:
lesson_2.ipynb.