Syllabus B Tech Computer Science Seventh Semester Data Science And Big Data CS7005

The concepts developed in this course help quantify several ideas in Computer Science Engineering that were introduced in earlier engineering courses. The latest syllabus for B Tech Computer Science, Seventh Semester, Data Science And Big Data (CS7005) is given here.

The objective of this course is to develop the ability to solve problems and to gain insight into the problem-solving process, with emphasis on thermodynamics, specifically in the following manner:

  • Apply conservation principles (mass and energy) to evaluate the performance of simple engineering systems and cycles.
  • Evaluate thermodynamic properties of simple homogeneous substances.
  • Analyze processes and cycles using the second law of thermodynamics to determine maximum efficiency and performance.
  • Discuss the physical relevance of the numerical values obtained for specific engineering problems, and of the problems in general.
  • Critically evaluate the validity of the numerical solutions for specific engineering problems.

More precisely, the objectives are:

  • To enable young technocrats to acquire the mathematical knowledge needed to understand the Laplace transform, the inverse Laplace transform and the Fourier transform, which are used in various branches of engineering.
  • To introduce effective mathematical tools for the numerical solution of algebraic and transcendental equations.
  • To acquaint the student with the mathematical tools available in statistics that are needed in various fields of science and engineering.

CS 7005 – Data Science And Big Data

Unit 1
Understanding Data: Data Wrangling and Exploratory Analysis, Data Transformation & Cleaning, Feature Extraction, Data Visualization. Introduction to contemporary tools and programming languages for data analysis like R and Python.
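As an illustration of the data transformation and cleaning topic in Unit 1, the following is a minimal sketch using only the Python standard library. The records, the field names (`name`, `age`, `city`) and the choice of median imputation are assumptions made for this example and are not prescribed by the syllabus.

```python
from statistics import median

# Made-up raw records with stray whitespace, inconsistent casing,
# missing ages and one exact duplicate (all hypothetical data).
raw_rows = [
    {"name": " Asha ", "age": "21", "city": "bhopal"},
    {"name": "Ravi", "age": "", "city": "Indore "},
    {"name": "Meena", "age": "23", "city": "BHOPAL"},
    {"name": "Ravi", "age": "", "city": "Indore "},
]


def clean(rows):
    """Normalise text fields, drop duplicate rows, fill missing ages."""
    cleaned = []
    seen = set()
    for row in rows:
        name = row["name"].strip()
        city = row["city"].strip().title()
        # An empty string is treated as a missing value.
        age = int(row["age"]) if row["age"].strip() else None
        key = (name, age, city)
        if key in seen:
            continue  # skip rows that are identical after normalisation
        seen.add(key)
        cleaned.append({"name": name, "age": age, "city": city})

    # Impute missing ages with the median of the known ages.
    # (median() raises StatisticsError if no age is known at all.)
    known_ages = [r["age"] for r in cleaned if r["age"] is not None]
    fill_value = median(known_ages)
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill_value
    return cleaned


rows = clean(raw_rows)
for r in rows:
    print(r)
```

In practice a library such as pandas is commonly used for this kind of work; the plain-Python version is shown only to keep each step visible.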
Unit 2
Statistical & Probabilistic analysis of Data: Multiple hypothesis testing, Parameter Estimation methods, Confidence intervals, Bayesian statistics and Data Distributions.
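The confidence interval topic in Unit 2 can be illustrated with a short sketch. It assumes a normal approximation for the sample mean; for a sample as small as the one below, a Student's t critical value would normally be preferred. The function name `mean_confidence_interval` and the measurements are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    center = mean(sample)
    # Standard error of the mean, based on the sample standard deviation.
    std_err = stdev(sample) / sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * std_err, center + z * std_err


sample = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2, 5.0, 4.8]  # made-up measurements
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The interval is symmetric around the sample mean, and it narrows as the sample size grows or the spread of the data shrinks.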
Unit 3
Introduction to machine learning: Supervised & unsupervised learning, classification & clustering algorithms, Dimensionality reduction: PCA & SVD, Correlation & Regression analysis, Training & testing data: Overfitting & Underfitting.
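The correlation, regression and training/testing topics in Unit 3 can be sketched together: fit a straight line by ordinary least squares on a training split, then measure the error on held-out data. The data set (roughly following y = 2x + 1) and the helper names are assumptions made for illustration.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between predictions and observed values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up observations, roughly y = 2x + 1 with small noise.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9]

# Hold out the last two points for testing.
train_x, test_x = xs[:6], xs[6:]
train_y, test_y = ys[:6], ys[6:]

slope, intercept = fit_line(train_x, train_y)
r = pearson_r(train_x, train_y)
train_mse = mean_squared_error(train_x, train_y, slope, intercept)
test_mse = mean_squared_error(test_x, test_y, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.4f}")
print(f"train MSE={train_mse:.4f} test MSE={test_mse:.4f}")
```

A test error much larger than the training error would suggest overfitting, while large errors on both splits would suggest underfitting.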
Unit 4
Introduction to Information Retrieval: Boolean Model, Vector model, Probabilistic Model, Text-based search: Tokenization, TF-IDF, stop words and n-grams, synonyms and part-of-speech tagging.
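Tokenization, stop-word removal and TF-IDF weighting from Unit 4 can be combined in a small sketch. It uses one common TF-IDF variant (relative term frequency multiplied by log(N / document frequency)); other variants exist. The stop-word list and the three documents are tiny illustrative assumptions.

```python
import math
import re
from collections import Counter

# A deliberately tiny stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "on"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "The cat sat on the mat",
    "The dog sat on the log",
    "Cats and dogs",
]
weights = tf_idf(docs)
for doc, w in zip(docs, weights):
    print(doc, "->", {term: round(value, 3) for term, value in w.items()})
```

Terms that occur in fewer documents, such as "cat", receive a higher weight than terms shared across documents, such as "sat". Because no stemming is applied, "cat" and "cats" are treated as different terms.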
Unit 5
Introduction to Web Search & Big Data: Crawling and Indexes, Search Engine architectures, Link Analysis and ranking algorithms such as HITS and PageRank, Hadoop file system & MapReduce paradigm.
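Of the link-analysis algorithms named in Unit 5, PageRank can be sketched as a simple power iteration. The four-page link graph, the damping factor of 0.85 and the fixed iteration count are assumptions chosen for illustration; this is not a production implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a {page: [outgoing links]} mapping."""
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its damped rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A links to B and C, B to C, C to A, D to C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.4f}")
```

The scores sum to 1. Page C ranks highest because three pages link to it, and page D ranks lowest because no page links to it.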

Books Recommended

1. Peter Bruce, “Practical Statistics for Data Scientists: 50 Essential Concepts”, Shroff/O’Reilly, First edition, 2017.
2. Pang-Ning Tan, “Introduction to Data Mining”, Pearson Education.
3. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, “Modern Information Retrieval”, Pearson Education.