Page 78 - B.E CSE Curriculum and Syllabus R2017 - REC

Department of CSE, REC



               2.  Kai Hwang, Geoffrey C. Fox and Jack J. Dongarra, Distributed and Cloud Computing: Clusters,
                   Grids, Clouds and the Future of Internet, First Edition, Morgan Kaufmann Publishers, an Imprint of
                   Elsevier, 2012.
            REFERENCES:
               1.  Michael J. Kavis, Architecting the Cloud: Design Decisions for Cloud Computing Service Models
                   (SaaS, PaaS, and IaaS), First Edition, Wiley.
               2.  Tom White, Hadoop: The Definitive Guide, Yahoo Press, 2014.
               3.  Rajkumar Buyya, Christian Vecchiola, and Thamarai Selvi, Mastering Cloud Computing, Tata
                   McGraw Hill, 2013.
               4.  John W. Rittinghouse and James F. Ransome, Cloud Computing: Implementation, Management, and
                   Security, CRC Press, 2010.

            IT17701                                  DATA ANALYTICS                               L T P C
                                         (Common to B.E. CSE and B.Tech. IT)                                   3  0 0 3

            OBJECTIVES:
                 To introduce the concepts of Big Data and Hadoop
                 To explain HDFS and MapReduce concepts
                 To familiarize students with the Hadoop ecosystem and NoSQL databases
                 To describe data stream analytics methodologies
                 To present various data analysis techniques
            UNIT I        INTRODUCTION TO BIG DATA AND HADOOP                                             6
            Introduction to Big Data, Types of Digital Data, Challenges of Conventional Systems - Web Data, Evolution of
            Analytic Processes and Tools, Analysis vs. Reporting - Big Data Analytics, Introduction to Hadoop - Distributed
            Computing Challenges - History of Hadoop, Hadoop Ecosystem.

            UNIT II       HDFS (HADOOP DISTRIBUTED FILE SYSTEM) AND MAP REDUCE                               6
            Hadoop Overview – Use Case of Hadoop – Hadoop Distributors – HDFS – Processing Data with Hadoop –
            MapReduce – Managing Resources and Applications with Hadoop YARN – Interacting with Hadoop
            Ecosystem.

            UNIT III      NOSQL DATABASES                                                                12
            NoSQL – Pig: Introduction to Pig, Execution Modes of Pig, Comparison of Pig with Databases, Grunt, Pig
            Latin, User Defined Functions, Data Processing Operators – Hive: Hive Shell, Hive Services, Hive Metastore,
            Comparison with Traditional Databases, HiveQL, Tables, Querying – MongoDB: Needs, Terms, Data Types,
            Query Language – Cassandra: Introduction, Features, Querying Commands.

            UNIT IV   MINING DATA STREAMS                                                                9
            Introduction to Stream Concepts – Stream data model and architecture – Stream computing – Sampling data in
            a stream – Filtering streams – Counting distinct elements in a stream – Estimating moments – Counting
            ones in a window – Decaying window – Real-Time Analytics Platform (RTAP) applications – Case studies:
            real-time sentiment analysis, stock market predictions.
            UNIT V     DATA ANALYSIS AND VISUALIZATION                                                   12
            Regression modelling, Multivariate analysis, Decision Trees, Support vector and kernel methods, Neural
            networks: learning and generalization, competitive learning, principal component analysis and neural
            networks; Clustering Techniques – Hierarchical – K-Means – Clustering high-dimensional data –
            Frequent pattern based clustering methods – Clustering in Non-Euclidean space – Clustering for streams
            and parallelism – Visualization – Time series analysis.
                                                                                           TOTAL: 45 PERIODS
            OUTCOMES:
            At the end of the course, the student will be able to:
                 Understand the usage scenarios of Big Data analysis and the Hadoop framework
                 Apply MapReduce over HDFS