Page 538 - Python Data Science Handbook

discriminative classification, 405-407
documentation, accessing
   IPython, 3-8, 98
   Pandas, 98
double question mark (??), 5
dropna() method, 125
dynamic typing, 34

E
eigenfaces, 442-445
ensemble estimator/method, 421
   (see also random forests)
ensemble learner, 421
equidistant cylindrical projection, 301
errors, visualizing
   basic errorbars, 238
   continuous quantities, 239
   Matplotlib, 237-240
Estimator API, 346-359
   basics, 347
   Iris classification example, 351
   Iris clustering example, 353
   Iris dimensionality example, 352
   simple linear regression example, 347-354
eval() function, 210-211
   DataFrame.eval() method and, 211-213
   pd.eval() function and, 210-211
   when to use, 214
exceptions, controlling, 20-22
expectation-maximization (E-M) algorithm
   caveats, 467-470
   GMM as generalization of, 480-484
   k-means clustering and, 465-476
exponentials, 55
external code, magic commands for running, 12

F
face recognition
   HOG, 506-514
   Isomap, 456-460
   PCA, 442-445
   SVMs, 416-420
faceted histograms, 318
factor plots, 319
fancy indexing, 78-85
   basics, 79
   binning data, 83
   combined with other indexing schemes, 80
   modifying values with, 82
   selection of random points, 81
feature engineering, 375-382
   categorical features, 376
   derived features, 378-380
   image features, 378
   imputation of missing data, 381
   processing pipeline, 381
   text features, 377
feature, data point, 334
features matrix, 344
fillna() method, 126
filter() method, 166
FiveThirtyEight stylesheet, 287
fixed-type arrays, 38

G
Gaussian basis functions, 394-396
Gaussian mixture models (GMMs), 476-491
   choosing covariance type, 484
   clustering with, 353
   density estimation algorithm, 484-488
   E-M generalization, 480-484
   handwritten data generation example, 488-491
   k-means weaknesses addressed by, 477-480
   KDE and, 491
Gaussian naive Bayes classification, 351, 357, 383-386, 510
Gaussian process regression (GPR), 239
generative models, 383
geographic data, 298
   Basemap toolkit for, 298
   California city population example, 308
   drawing a map background, 304-307
   map projections, 300-304
   plotting data on maps, 307
   surface temperature data example, 309
get() operation, 183
get_dummies() method, 183
ggplot stylesheet, 287
graphics libraries, 330
GroupBy aggregation, 170
GroupBy object, 163-165
   aggregate() method, 166
   apply() method, 167
   column indexing, 163
   dispatch methods, 164
   filter() method, 166

