Introduction to Data Analytics

UTS

Course Description

  • Course Name

    Introduction to Data Analytics

  • Host University

    UTS

  • Location

    Sydney, Australia

  • Area of Study

    Computer Engineering, Computer Info Systems, Computer Programming, Computer Science, Information Sciences, Information Technologies, Management of Technology

  • Language Level

    Taught In English

  • Course Level Recommendations

    Upper

    ISA offers course level recommendations in an effort to facilitate the determination of course levels by credential evaluators. We advise each institution to have its own credentials evaluator make the final decision regarding course levels.

Hours & Credits

  • Credit Points

    6
  • Recommended U.S. Semester Credits
    4
  • Recommended U.S. Quarter Units
    6
  • Overview

    Description
    Data analytics is the art and science of turning large quantities of usually incomprehensible data into meaningful and commercially valuable information. It is the basis of modern computer analytics and intelligence. It includes a number of IT areas, such as statistical methods for identifying patterns in data and making inferences; database technologies for managing the data sets to be mined; a range of intelligent technologies that automatically derive patterns from data; and visualisation and other multimedia techniques that support human pattern discovery capabilities. This subject offers the foundations of data analytics, data mining and knowledge discovery methods and their application to practical problems. It brings together state-of-the-art research and practical techniques in data analytics, providing students with the necessary knowledge and capacity to initiate and conduct data mining research and development projects, and to communicate professionally with analytics experts.
    Subject objectives
    Upon successful completion of this subject students should be able to:
    1. Explain the background of data analytics including the business and society context;
    2. Use data analytics to explore and gain a broad understanding of a dataset;
    3. Outline the scope and limitations of several state-of-the-art data analytics methods;
    4. Use data analytics methods to make predictions for a dataset;
    5. Learn the fundamental knowledge to organise and implement a data analytics project in a business environment;
    6. Communicate the results of a data analytics project.
    This subject also contributes specifically to the development of the following course intended learning outcomes:
    Identify, interpret and analyse stakeholder needs. (A.1)
    Apply systems thinking to understand complex system behaviour including interactions between components and with other systems (social, cultural, legislative, environmental, business etc.) (A.5)
    Identify and apply relevant problem solving methodologies (B.1)
    Design components, systems and/or processes to meet required specifications (B.2)
    Synthesise alternative/innovative solutions, concepts and procedures (B.3)
    Communicate effectively in ways appropriate to the discipline, audience and purpose. (E.1)
    Work as an effective member or leader of diverse teams within a multi-level, multi-disciplinary and multi-cultural setting (E.2)
    Teaching and learning strategies
    Subject presentation includes combined lecture and laboratory sessions (3 hours), and research and development work for the assignments. Lectures will present the theoretical aspects of data mining, including guest lectures about case studies of real-world business applications of data mining techniques. The laboratory sessions are conducted in the lab and require substantial preparation from the students. They will focus on hands-on experience with data mining and data analytics tools, and on understanding and interpreting the results. Practical assignments can be performed anywhere; the labs will provide the tools necessary to complete them.
    Content
    The subject will cover topics from the following:
    Introduction to data mining: problems; data mining concepts, types of data that we collect, the data mining and knowledge discovery process (CRISP-DM methodology, SAS SEMMA methodology), differences between data mining and knowledge discovery, what can be discovered; the concepts of 'interestingness', 'usefulness' and 'novelty' of discovered patterns; overview of application areas, the data mining professional.
    Visual data exploration and mining: data visualisation techniques and their applicability in data mining, visual data mining methods.
    Data pre-processing and transformation: problems; small and large data sets; missing data and dealing with it; noisy data and sampling; techniques for data cleaning; techniques for removing sensitive information, legal issues.
    Association Rules Mining: problems; frequent item sets; mining single-dimensional Boolean association rules (A-priori algorithm and its modifications); mining multilevel association rules; mining multidimensional association rules; constraint-based association mining; association rules representation and visualisation; applications in basket data analysis, cross-marketing, catalogue design.
    Classification and Prediction: problems for classification and prediction; classification by decision tree induction; Bayesian classification (Naïve Bayesian classifier, Bayesian networks); classification by backpropagation (limited coverage of neural network techniques); classification based on concepts from association rule mining and other methods; classification accuracy; issues in prediction; applications in medical diagnosis, credit approval, target marketing, DNA microarray analysis, treatment effectiveness analysis.
    Clustering: problems for cluster analysis; types of data; partitioning methods, hierarchical methods; density-based methods; grid-based methods; model-based clustering methods; outlier detection and analysis.
    Introduction to spatial and multimedia data mining: spatial data characteristics; mining spatial associations; spatial classification and trend analysis; clustering with spatial obstacles; image content analysis; content-based indexing based on colour histograms, texture, shape, objects, and wavelet transforms; integrated mining of multimedia content; applications in state and region analysis, analysis of climate changes.
    Introduction to mining sequential data: trend and periodicity analyses; similarity search in time series; subsequence matching; sequential pattern mining; applications to financial markets, bioinformatics and Web mining.
    Deployment of results: representing patterns as rules, functions, cases; model deployment; industry applications.
    Assessment
    Assessment task 1: The data mining consultant
    Objective(s):
    This assessment task addresses the following subject learning objectives:
    1 and 5
    This assessment task contributes to the development of the following course intended learning outcomes:
    A.1, A.5, B.2 and E.1
    Type: Report
    Groupwork: Individual
    Weight: 35%
    Length:
    The task requires submission of a report of 10 pages in an 11- or 12-point font.
    Criteria linkages:
    Criteria | Weight (%) | SLOs | CILOs
    Research into the background of the chosen problem | 33 | 1 | A.1, E.1
    Demonstrated understanding of the data analytics problem being solved | 33 | 1, 5 | A.1, A.5, E.1
    Quality and feasibility of the developed approach | 34 | 1, 5 | A.1, B.2, E.1
    SLOs: subject learning objectives
    CILOs: course intended learning outcomes
    Assessment task 2: Data exploration and preparation
    Objective(s):
    This assessment task addresses the following subject learning objectives:
    2 and 3
    This assessment task contributes to the development of the following course intended learning outcomes:
    B.1, B.2, B.3 and E.1
    Type: Report
    Groupwork: Individual
    Weight: 25%
    Length:
    A report of about 20 pages.
    Criteria linkages:
    Criteria | Weight (%) | SLOs | CILOs
    Correctness and understanding of pre-processing and transformation steps | 33 | 2, 3 | B.1, B.2, B.3
    Depth of understanding of the data exploration | 33 | 2 | B.1, B.2, B.3, E.1
    Quality of the communication of results | 34 | 2 | E.1
    SLOs: subject learning objectives
    CILOs: course intended learning outcomes
    Assessment task 3: Data mining in action
    Objective(s):
    This assessment task addresses the following subject learning objectives:
    2, 3, 4 and 6
    This assessment task contributes to the development of the following course intended learning outcomes:
    A.1, B.1, B.2, B.3, E.1 and E.2
    Type: Report
    Groupwork: Group, group and individually assessed
    Weight: 40%
    Length:
    A report of around 40-50 pages.
    Criteria linkages:
    Criteria | Weight (%) | SLOs | CILOs
    Individual - quality of results | 17 | 2, 3, 4 | B.1, B.2
    Individual - depth of understanding | 17 | 2, 3, 4 | B.1, B.3
    Individual - communication of the models constructed by the individual | 17 | 3, 4, 6 | E.1
    Group - quality of the group decision of the final model | 17 | 4, 6 | A.1, B.1, B.2, B.3, E.2
    Group - interpretation of the results | 17 | 4, 6 | E.2
    Group - quality of the presentation | 15 | 6 | E.1, E.2
    SLOs: subject learning objectives
    CILOs: course intended learning outcomes
    Minimum requirements
    The assignment due dates are shown in the Subject Schedule. To pass the subject a student must gain a mark of over 50.
    A late penalty of up to 50% may be applied to submitted work unless prior arrangements have been made with the subject coordinator. Details of the late penalties will be included with the descriptions of the assessment item(s).

Course Disclaimer

Courses and course hours of instruction are subject to change.

Credits earned vary according to the policies of the students' home institutions. According to ISA policy and possible visa requirements, students must maintain full-time enrollment status, as determined by their home institutions, for the duration of the program.