Page 537 - Python Data Science Handbook

colon (:), 44                           as dictionary, 110-112
               color compression, 473-476              as generalized NumPy array, 102
               colorbars                               as specialized dictionary, 103
                  colormap selection, 256-259          as two-dimensional array, 112-114
                  customizing, 255-262                 constructing, 104
                  discrete, 260                        data selection in, 110
                  handwritten digit example, 261-262   defined, 97
               colormap, 256-259                       index alignment in, 117
               column(s)                               masking, 114
                  accessing single, 45                 multiply indexed, 136
                  indexing, 163                        operations between Series object and, 118
                  MultiIndex for, 133                  slicing, 114
                  sorting arrays along, 87           DataFrame.eval() method, 211-213
                  suffixes keyword and overlapping names,  assignment in, 212
                    153                                local variables in, 213
               column-wise operations, 211-213       DataFrame.query() method, 213
               command history shortcuts, 9          datasets
               comparison operators, 71-73             appending, 146
               concatenation                           combining (Pandas), 141-158
                  datasets, 141-146                    concatenation, 141-146
                  of arrays, 48, 142                   merging/joining, 146-158
                  with pd.concat(), 142-146          datetime module, 189
               confusion matrix, 357                 datetime64 dtype, 189
               conic projections, 303                dateutil module, 189
               contour plots, 241-245                debugging, 22-24
                  density and, 241-245               decision trees, 421-426
                  three-dimensional function, 241-245  (see also random forests)
                  three-dimensional plot, 292          creating, 422-425
               Conway, Drew, xi                        overfitting, 425
               cross-validation, 361-370             deep learning, 513
               cubehelix colormap, 258               density estimator
               cylindrical projections, 301            GMM, 484-488
                                                       histogram as, 492
               D                                       KDE (see kernel density estimation (KDE))
               data                                  describe() method, 164
                  as arrays, 33                      development, IPython
                  missing (see missing data)           profiling and timing code, 25-30
               data representation (Scikit-Learn package),  profiling full scripts, 27
                  343-346                              timing of code snippets, 25-27
                  data as table, 343                 dictionary(-ies)
                  features matrix, 344                 DataFrame as specialization of, 103
                  target array, 344-345                DataFrame object constructed from list of,
               data science, defining, xi                 104
               data types, 34                          Pandas Series object vs., 100
                  fixed-type arrays, 38              digits, recognition of (see optical character
                  integers, 35                         recognition)
                  lists in, 37-41                    dimensionality reduction, 261
                  NumPy, 41                            machine learning, 340-342
               DataFrame object (Pandas), 102-105      PCA and, 433


                                                                              Index  |  519