Zicklin School of Business – Baruch College
City University of New York
CIS 9760 – Big Data Technologies
Spring 2025 – Section FMWA – Hybrid/Synchronous
Monday/Wednesday 4:10 – 5:25 pm
DRAFT Syllabus
Professor | Dr. Richard Holowczak
Phone: 646-312-3371
Office Hours: Monday and Wednesday, 12:30pm – 1:30pm, or by appointment
E-Mail: [email protected] (Preferred)
Please put the following in the Subject line of any e-mail to me: CIS 9760 followed by the specific subject of your e-mail. |
Instructional Modality | Section FMWA will be Hybrid for Spring 2025.
Monday – In person in Room NVC 7-150
Wednesday – Online; lectures will be synchronous. |
Objectives | This course gives students an overview of the big data technologies used to efficiently store, extract, and process very large datasets. Students will learn key data analysis and management techniques, including critical concepts such as Distributed File Systems (storage concepts) and MapReduce/Spark (processing concepts) that power modern big data technologies. In particular, the course will leverage cloud services to manage storage and efficiently process data. The course will also show how big data technologies can be used to analyze large volumes of data for practical applications. |
Course Learning Goals |
Upon successful completion of this course, students will be able to:
|
Prerequisites | Prerequisite: CIS 9650 – Programming for Analytics
Suggested: CIS 9660 – Data Mining for Business Analytics |
Textbooks / Materials / Resources |
|
Course Content | There will be 7 homework assignments, including:
- Google Cloud Platform exercises
- DataCamp: Big Data with PySpark Skill Track courses
There will be an in-person Midterm Exam and Final Exam (not cumulative).
An individual machine learning Project will be due at the end of the semester.
Students are expected to spend a significant amount of time outside the classroom learning to use Google Cloud Platform and programming with PySpark. |
Grading |
|
This is a tentative grading schedule and is subject to change. Credit will not be given for assignments submitted after homework solutions are discussed. There will be no extra credit assignments. |
Semester Project |
Students will complete an individual machine learning project during the semester. The project will have six milestones due throughout the semester:
|
Topics / Schedule (Tentative)
The following table gives a tentative lecture schedule for the course.
Week | Topics | Google Cloud / DataCamp | Project Milestones |
---|---|---|---|
1 | Course Introduction; Introduction to Big Data | DataCamp: Understanding Data Engineering (Due 2/3) | |
2 | Python Review; Introduction to Machine Learning | Google Cloud Platform Exercises (Due 2/17) | Milestone 1 Proposal (Due 2/21) |
3 | Introduction to Cloud Computing | Continued | |
4 | Cloud Computing – Compute Services; GCP Compute Engine and working with the command line | DataCamp: Introduction to PySpark (Due 2/28) | |
5 | Cloud Computing – Cloud Storage Services | Continued | Milestone 2 Data Acquisition (Due 3/3) |
6 | Hadoop – Architecture and HDFS | DataCamp: Big Data Fundamentals with PySpark (Due 3/14) | |
7 | Hadoop – MapReduce and YARN; GCP Dataproc | Continued | |
8 | Catch-up class; Review for Midterm Exam | DataCamp: Cleaning Data with PySpark (Due 3/28) | |
9 | Midterm Exam (3/24 Tentative); Spark Architecture and PySpark | Continued | Milestone 3 EDA / Data Cleaning (Due 3/28) |
10 | Spark: RDDs, DataFrames, Datasets, Spark SQL | DataCamp: Feature Engineering with PySpark (Due 4/11) | |
11 | Spark: Feature Engineering with PySpark | Continued | |
12 | Spark: Machine Learning with MLlib | DataCamp: Machine Learning with PySpark (Due 4/25) | Milestone 4 FE and Modeling (Due 4/18) |
13 | Spark: Pipelines and Cross Validation | (Optional) DataCamp: Building Rec. Engines with PySpark | Milestone 5 Data Visualization (Due 5/2) |
14 | Spark: Streaming | | |
15 | Scripting, Automation and MLOps; Final Exam Review | | Milestone 6 Final Report (Due 5/16) |
16 (TBD) | Final Exam, to be held during the 2-hour final exam period | | |
Please note that this schedule is subject to change. Students are expected to come to class prepared and ready to participate.
Google Cloud Platform |
This course is structured around Google Cloud Platform, which will be used extensively for demonstrations, homework and projects. Students are responsible for monitoring their services and billing statements, and for setting up budgets and alerts to prevent excessive charges.
|
Final Letter Grades |
Letter grades are calculated according to the Official Grading System of Baruch College. The instructor reserves the right to curve the scale when computing final grades, if deemed necessary.
|
Grade Distribution |
The Paul H. Chook Department of Information Systems and Statistics expects to see a reasonable distribution of grades in each class. For graduate courses this distribution is:
Due to these guidelines, the professor reserves the right to curve final letter grades up or down. |
Academic Integrity Statement |
I fully support Baruch College’s policy on Academic Honesty, which states, in part:
“Academic dishonesty is unacceptable and will not be tolerated. Cheating, forgery, plagiarism and collusion in dishonest acts undermine the college’s educational mission and the students’ personal and intellectual growth. Baruch students are expected to bear individual responsibility for their work, to learn the rules and definitions that underlie the practice of academic integrity, and to uphold its ideals. Ignorance of the rules is not an acceptable excuse for disobeying them. Any student who attempts to compromise or devalue the academic process will be sanctioned.”
Academic sanctions in this class will range from an F on the assignment to an F in this course. A report of suspected academic dishonesty will be sent to the Office of the Dean of Students. Additional information and definitions can be found at https://provost.baruch.cuny.edu/teaching-learning-student-success/academic_honesty/
The use of AI (ChatGPT and similar) for coursework and assignments is strictly prohibited. This includes, but is not limited to, the use of AI-generated text, speech, programming code, diagrams or images, as well as the use of AI tools or software to complete any portion of a project, assignment or exam. Any use of AI tools to complete your work or a portion of your work will result in a grade of 0. |
Statement on Lecture Recording |
Students who participate in this class with their camera on or use a profile image are agreeing to have their video or image recorded solely for the purpose of creating a record for students enrolled in the class to refer to, including those enrolled students who are unable to attend live. If you are unwilling to consent to have your profile or video image recorded, be sure to keep your camera off and do not use a profile image. Likewise, students who un-mute during class and participate orally are agreeing to have their voices recorded. If you are not willing to consent to have your voice recorded during class, you will need to keep your mute button activated and communicate exclusively using the “chat” feature, which allows students to type questions and comments live. |
Baruch College Counseling Center |
At Baruch, we acknowledge that as a student, you are balancing many demands. During the semester, if you start to experience personal difficulties or stressors that are interfering with your academic performance or day-to-day functioning, please consider seeking free and confidential support at the Baruch College Counseling Center. For more information or to make an appointment, please visit their website at studentaffairs.baruch.cuny.edu/counseling/ or call 646-312-2155. If it is outside of business hours (Monday-Friday, 9am-5pm) and you need immediate assistance, please call 1-888-NYC-WELL (888-692-9355). If you are concerned about one of your classmates, please share that concern by filling out a Campus Intervention Team form at studentaffairs.baruch.cuny.edu/campus-intervention-team. |
Students with Disabilities | Students with disabilities may receive assistance and accommodation of various sorts to enable them to participate fully in courses at Baruch. To establish the accommodations appropriate for each student, please alert me to your needs and contact the Office of Services for Students with Disabilities, part of the Division of Student Development and Counseling. For more information contact the Director of this office in NVC 2-271 or at (646) 312-4590. |
Additional Notes |
|
Important Dates |
Baruch Academic Calendar for Spring 2025
- January 25, Saturday: Official start of the Spring Semester
- January 27, Monday: First class session for CIS 9760
- January 29, Wednesday: No Class
- February 12, Wednesday: No Class
- February 17, Monday: No Class
- February 18, Tuesday: Classes follow Monday schedule
- March 6, Thursday: Classes follow Wednesday schedule
- March 31, Monday: No Class
- April 1, Tuesday: Last day to withdraw with "W" grade
- April 12-20: Spring Recess
- April 14, Monday: No Class
- April 16, Wednesday: No Class
- May 14, Wednesday: Last class for CIS 9760
- May 16-22: Final exams
- May 27, Tuesday: Final Grades Submitted |