Page 25 - REC :: M.E. CSE Curriculum and Syllabus - R2019
CP19P06 BIG DATA ANALYTICS
Category: PE | L: 3 | T: 0 | P: 0 | C: 3
Objectives:
⚫ Gaining factual knowledge regarding data acquisition, data cleansing, and various aspects of data analytics and visualization
⚫ Learning the principles of data analytics and its underlying methods and algorithms
⚫ Learning to apply the methods of distributed data storage and processing using Hadoop-related tools and MapReduce concepts
⚫ Understanding the necessity of streaming data analysis and its applications
⚫ Developing the skills necessary to use related software tools to perform data collection, cleansing, and analytics
UNIT-I BIG DATA ANALYTICS 9
Introduction to Big Data Analytics, Data Structures, BI Vs Analytics, Analytic Architecture, Data Analytics Life
Cycle, R Language for Data Analytics, Basic Features, Data Import and Export, Descriptive Statistics, Predictive
Analytics
UNIT-II ANALYTICAL THEORY 9
Overview of Clustering, Classification and Correlation, K-means, Supervised and Unsupervised Learning, Linear,
Logistic and Lasso Regression, Bayesian Modelling, Time Series Analysis, Association Analysis and Cluster
Analysis.
UNIT-III HADOOP ECOSYSTEM 9
Hadoop Stack for Big Data, Processing Data with Hadoop, HDFS, Hadoop MapReduce 2.0, Job Scheduling, Shuffle
and sort, Hadoop Related Technologies: Hive, Mahout, Zookeeper, HBase, and Cassandra.
UNIT-IV STREAMING DATA ANALYTICS 9
Introduction to Stream Concepts – Stream data model and architecture – Stream Computing – Sampling data in a
stream – Filtering streams – Counting distinct elements in a stream – Estimating moments – Counting ones in a
window – Decaying window – Real-Time Analytics Platform (RTAP) applications – Case studies: real-time sentiment
analysis, stock market predictions.
UNIT-V ADVANCED TOOLS FOR ANALYTICS 9
Stream Analytics using Apache Spark and Flink, Graph Database using Neo4j, Applications of Spark ML library,
In-Memory Databases: VoltDB, SciDB, Data Analytics in Cloud: Tableau, AWS Kinesis, and AWS EMR.
Total Contact Hours : 45
Course Outcomes:
Upon completion of the course, students will be able to:
⚫ Analyze the importance of analytics and identify its features.
⚫ Understand different types of supervised and unsupervised learning algorithms.
⚫ Examine the implementation techniques for big data analysis.
⚫ Process streaming data sets using stream processors.
⚫ Use various tools to analyze data sets in real time.
Reference Book(s):
1 EMC Education Services, “Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data”, Wiley, 2015.
2 Jen Stirrup and Ruben Oliva Ramos, “Advanced Analytics with R and Tableau”, Packt Publishing Limited, 2017.
3 Anand Rajaraman and Jeffrey David Ullman, “Mining of Massive Datasets”, Cambridge University Press, 2012.

