Master of Data Science
- CRICOS Code: 092791B
What will I study?
The Master of Data Science is a 200-point program, made up of:
- Core statistics subjects (50 points)
- Core computer science subjects (50 points)
- Elective subjects (75 points), which may include prerequisite subjects if needed (up to 50 points), data science or professional skills subjects, or a research project
- Capstone data science project (25 points)
Your elective subjects will be tailored to you, depending on your previous academic background and your interests.
First, you’ll need to look at whether you need any prerequisite subjects:
- If you have a statistics background, you’ll complete computer science prerequisite subjects.
- If you have a computer science background, you’ll complete statistics prerequisite subjects.
- If you meet both the computer science and statistics prerequisites, no prerequisite subjects are needed.
If you are coming from the Graduate Diploma in Data Science or a University of Melbourne Data Science undergraduate major (or equivalent), you may be eligible for an accelerated 1.5-year (150-point) program, receiving up to 50 points of credit.
Once you’ve taken care of any prerequisites, you can choose from a diverse list of data science or professional skills electives. If you’d like to gain experience in a science and technology workplace or with research, you can do an 80–100-hour internship subject or take on an additional research project.
All students undertake a data science capstone project, over one academic year, working on a practical data science research question either individually or as part of a team.
Explore this course
Explore the subjects you could choose as part of this degree.
Computer science background
If you have a computer science background, you may need to take the following subjects (as part of your elective component, to satisfy the prerequisites for the statistics core subjects):
- Methods of Mathematical Statistics 25 pts
This subject introduces probability and the theory underlying modern statistical inference. Properties of probability are reviewed, univariate and multivariate random variables are introduced, and their properties are developed. It demonstrates that many commonly used statistical procedures arise as applications of a common theory. Both classical and Bayesian statistical methods are developed. Basic statistical concepts including maximum likelihood, sufficiency, unbiased estimation, confidence intervals, hypothesis testing and significance levels are discussed. Computer packages are used for numerical and theoretical calculations.
- A First Course In Statistical Learning 25 pts
Supervised statistical learning is based on the widely used linear models that model a response as a linear combination of explanatory variables. Initially this subject develops an elegant unified theory for a quantitative response that includes the estimation of model parameters, hypothesis testing using analysis of variance, model selection, diagnostics on model assumptions, and prediction. Some classification methods for qualitative responses are then developed. This subject then considers computational techniques, including the EM algorithm. Bayes methods and Monte-Carlo methods are considered. The subject concludes by considering some unsupervised learning techniques.
Statistics background
If you have a statistics background, you may need to take the following subjects (as part of your elective component, to satisfy the prerequisites for the computer science core subjects):
- Programming and Software Development 12.5 pts
The aim of this subject is for students to develop an understanding of approaches to solving moderately complex problems with computers, and to be able to demonstrate proficiency in designing and writing programs. The programming language used is Java.
Topics covered will include:
- Java basics
- Console input/output
- Control flow
- Defining classes
- Using object references
- Programming with arrays
- Polymorphism and abstract classes
- Exception handling
- UML basics
- Algorithms and Complexity 12.5 pts
The aim of this subject is for students to develop familiarity and competence in assessing and designing computer programs for computational efficiency. Although computers manipulate data very quickly, to solve large-scale problems, we must design strategies so that the calculations combine effectively. Over the latter half of the 20th century, an elegant theory of computational efficiency developed. This subject introduces students to the fundamentals of this theory and to many of the classical algorithms and data structures that solve key computational questions. These questions include distance computations in networks, searching items in large collections, and sorting them in order.
Topics covered include complexity classes and asymptotic notation; empirical analysis of algorithms; abstract data types including queues, trees, priority queues and graphs; algorithmic techniques including brute force, divide-and-conquer, dynamic programming and greedy approaches; space and time trade-offs; and the theoretical limits of algorithm power.
- Elements of Data Processing 12.5 pts
Data processing is fundamental to computing and data science. This subject gives an introduction to various aspects of data processing including database management, representation and analysis of data, information retrieval, visualisation and reporting, and cloud computing. This subject introduces students to the area, with an emphasis on both tools and underlying foundations.
The subject's focus is on the data pipeline, and activities known colloquially as 'data wrangling'. Indicative topics covered include:
- Capturing data (data ingress)
- Data representation and storage
- Cleaning, normalisation and filling in missing data (imputation)
- Combining multiple sources of data (data integration)
- Query languages and processing
- Scripting to support the data pipeline
- Distributing a database over multiple nodes (sharding), cloud computing file systems
- Visualisation and presentation
- Database Systems & Information Modelling 12.5 pts
The subject introduces key topics in modern information organisation, particularly with regard to structured databases. The well-founded relational theory behind modern structured query language (SQL) engines has given them as much of a place behind an organisation's website and on the desktop as they traditionally enjoyed on corporate mainframes. Topics covered may include: the managerial view of data, information and knowledge; conceptual, logical and physical data modelling; normalisation and de-normalisation; the SQL language; data integrity; transaction processing; data warehousing; web services; and organisational memory technologies. This is a core foundation subject for both the Master of Information Systems and Master of Information Technology.
This subject serves as an introduction to databases and data modelling from a data management perspective. Database design, from conceptual design through to physical implementation, will be covered. This will include Entity Relationship modelling, normalisation and de-normalisation, and SQL. Additionally, the use of databases in various contexts will be explored (web-based databases, connecting programs to databases, data warehousing, health contexts, geospatial databases).
Statistics core subjects
- Statistical Modelling for Data Science 12.5 pts
Statistical models are central to data science applications. Modelling approaches such as linear and generalized linear models, mixed models, and non-parametric regression are developed. Applications to time series, longitudinal, and spatial data are discussed. Methods for causal inference and handling missing data are introduced.
- Multivariate Statistics for Data Science 12.5 pts
Modern statistics and data science deal with data having multiple dimensions. Multivariate methods are used to handle these types of data. Approaches to supervised and unsupervised learning with multivariate data are discussed. In particular, methods for classification, clustering, and dimension reduction are introduced, which are particularly suited to high-dimensional data. Both parametric and nonparametric approaches are discussed.
- Computational Statistics & Data Science 12.5 pts
Computing techniques and data mining methods are indispensable in modern statistical research and data science applications, where “Big Data” problems are often involved. This subject will introduce a number of recently developed methods and applications in computational statistics and data science that are scalable to large datasets and high-performance computing. The data mining methods to be introduced include general model diagnostic and assessment techniques, kernel and local polynomial nonparametric regression, basis expansion and nonparametric spline regression, generalised additive models, classification and regression trees, and forward stagewise and gradient boosting models. Important statistical computing algorithms and techniques used in data science will be explained in detail. These include bootstrap resampling and inference, cross-validation, the EM algorithm and Louis method, and Markov chain Monte Carlo methods including adaptive rejection and squeeze sampling, sequential importance sampling, slice sampling, the Gibbs sampler and the Metropolis-Hastings algorithm.
Computer science core subjects
- Cluster and Cloud Computing 12.5 pts
The growing popularity of the Internet, along with the availability of powerful computers and high-speed networks as low-cost commodity components, is changing the way we do parallel and distributed computing (PDC). Cluster and Cloud Computing are two approaches for PDC. Clusters employ cost-effective commodity components for building powerful computers within local-area networks. Recently, “cloud computing” has emerged as the new paradigm for delivery of computing as services in a pay-as-you-go model via the Internet. These approaches are used to tackle many research problems, with particular focus on "big data" challenges that arise across a variety of domains.
Some examples of scientific and industrial applications that use these computing platforms are: system simulations, weather forecasting, climate prediction, automobile modelling and design, high-energy physics, movie rendering, business intelligence, big data computing, and delivering various business and consumer applications on a pay-as-you-go basis.
This subject will enable students to understand these technologies, their goals, characteristics, and limitations, and develop both middleware supporting them and scalable applications supported by these platforms.
This subject is an elective subject in the Master of Information Technology. It can also be taken as an Advanced Elective subject in the Master of Engineering (Software).
- Cluster computing: elements of parallel and distributed computing, cluster systems architecture, resource management and scheduling, single system image, parallel programming paradigms, cluster programming with MPI
- Utility computing: foundations and grid computing technologies
- Cloud computing: cloud platforms, Virtualization, Cloud Application Programming Models (Task, Thread, and MapReduce), Cloud applications, and future directions in utility and cloud computing
- "Big data" processing and analytics in distributed environments.
- Statistical Machine Learning 12.5 pts
With exponential increases in the amount of data becoming available in fields such as finance and biology, and on the web, there is an ever-greater need for methods to detect interesting patterns in that data, and classify novel data points based on curated data sets. Learning techniques provide the means to perform this analysis automatically, and in doing so to enhance understanding of general processes or to predict future events.
Topics covered will include: supervised learning, semi-supervised and active learning, unsupervised learning, kernel methods, probabilistic graphical models, classifier combination, neural networks.
This subject is intended to introduce graduate students to machine learning through a mixture of theoretical methods and hands-on practical experience in applying those methods to real-world problems.
Topics covered will include: linear models, support vector machines, random forests, AdaBoost, stacking, query-by-committee, multiview learning, deep neural networks, un/directed probabilistic graphical models (Bayes nets and Markov random fields), hidden Markov models, principal components analysis, kernel methods.
- Advanced Database Systems 12.5 pts
Many applications require access to very large amounts of data. These applications often require reliability (data must not be lost even in the presence of hardware failures), and the ability to retrieve and process the data very efficiently.
The subject will cover the technologies used in advanced database systems. Topics covered will include: transactions, including concurrency, reliability (the ACID properties) and performance; and indexing of both structured and unstructured data. The subject will also cover additional topics such as: uncertain data; XQuery; the Semantic Web and the Resource Description Framework; dataspaces and data provenance; datacentres; and data archiving.
- Introduction to High Performance Database Systems
- Issues of Performance and Reliability
- Transaction Processing
- Recovery from Failures
- MapReduce Models.
- Data Science Project Pt1 12.5 pts
This capstone project will provide the culmination of the Master of Data Science degree. It will apply the skills developed during the degree to a practical problem of relevance to science, industry, commerce or society in general. Students will work in teams under only general guidance from staff members. Students will complete diaries to log their work on the project so that the extent of their contribution to group projects can be determined. In the first part of the project students will complete a literature review and a plan for their project.
- Data Science Project Pt2 12.5 pts
This capstone project will provide the culmination of the Master of Data Science degree. It will apply the skills developed during the degree to a practical problem of relevance to science, industry, commerce or society in general. Students will continue to work in their teams established in MAST90106 Data Science Project Part 1, again under only general guidance from staff members. They are expected to present technically correct results in a fashion acceptable to industry-based and other clients.
Students will continue to be expected to complete diaries to log their work on the project so that the extent of their contribution to group projects can be determined. In this part of the project students will complete the project and present a group project report and oral presentation.
Discipline elective subjects
- Foundations of Spatial Information 12.5 pts
This is an introductory subject to Geographic Information Systems (GIS) and Geographic Information Science, both practically and theoretically, at postgraduate level. Spatial information is ubiquitous in decision making. Be it in urban planning, in traffic or disaster management, in way-finding, in issues of the environment, public health and sustainability, or in economic contexts: the question of 'where' is a fundamental one. Spatial information is also special in many respects, such as its dimensionality and autocorrelation, its volume, its links to the Internet of Things (things are always located somewhere), to social networks (which exist in space and time), to streaming data from sensors everywhere, or to intelligent (location-aware) systems. The subject provides the foundations for more specialized subjects on spatial data management, spatial data analysis and spatial data visualization, and is of particular relevance to people wishing to establish a career in the spatial information industry, the environmental or planning industry. It is also suited for every postgraduate student who is looking for solid GIS skills.
We will discuss representations and analysis of this information in spatial information technologies, from location-based services to geographic information systems. Topics addressed are observing the environment, spatial and spatiotemporal data representations, spatial analysis and spatial communication. The practical part will introduce GIS in a hands-on manner, starting with individual software training and then applying new skills in a team-designed GIS project.
- Spatial Databases 12.5 pts
Spatial databases are fundamental to any geographical information system. Efficient and effective representation and retrieval of spatial information is a non-trivial task. This subject will cover the concepts, methods, and approaches that allow for efficient representation, querying, and retrieval of spatial data.
This subject builds on a student’s knowledge of computer programming, databases, and spatial information. Students who successfully complete this subject may find professional employment in designing, implementing, customising and maintaining databases for the increasingly wide range of spatial software applications.
Fundamentals of spatial databases; spatial data modelling in relational databases, including vector, raster, and network data; spatial operations, including geometric, topological, set-oriented, and network operations; spatial indexes and access methods, including quadtrees and R-trees.
- Spatial Analysis 12.5 pts
In this subject students will learn about the foundations of spatial data and their analysis. Emphasis will be placed on learning how to investigate the patterns that arise as a result of processes that may be operating in space. For example, students will learn to identify geographic clusters of disease cases, or hotspots of crime. A variety of scientific tools including probability theory, combinatorics, descriptive statistics, distributions and matrix algebra will be taught. Students will learn essential skills that are fundamental for all applications of geographic information.
The subject partners with other subjects on spatial data management and visualization, and is of particular relevance to people wishing to establish a career in the spatial information industry, the environmental or planning industry. Spatial Analysis builds on the fundamental knowledge of probability and statistics, mathematics, as well as computer literacy to write simple algorithms, and the preparation and management of data for sophisticated analysis software.
Spatial autocorrelation, spatial data structures and algorithms, point patterns, measures of dispersion, measures of arrangements, line and network analysis, patterns of areas and in fields, and the role of spatial scale and spatial aggregation problems.
- Information Visualisation 12.5 pts
Information Visualisation is about using and designing effective mechanisms for presenting and exploring the patterns embedded in large and complex data sets, and to support decision making. Information Visualisation is important in a range of domains dealing with voluminous data rich in structure, among them, prominently, data in the spatial domain or data referenced to the spatial domain. Through its focus on presentation and interaction with spatial information, this subject complements related subjects that deal with the storage and querying of data (database subjects such as GEOM90018 Spatial Databases), and the processing of data (data analytics subjects such as GEOM90006 Spatial Analysis). This subject is vital for anyone wishing to work with large datasets. It will also be of relevance to those with an interest in design, especially graphical and interaction design.
Fundamentals of information visualisation and data graphics; human perception; foundations of graphical user interface design; cartographic design; geovisualisation; exploratory visual spatial data analysis; evaluation of information visualisation interfaces.
- Analysis of High-Dimensional Data 12.5 pts
Modern data sets are growing in size and complexity due to the astonishing development of data acquisition and storage capabilities. This subject focuses on developing rigorous statistical learning methods that are needed to extract relevant features from large data sets, assess the reliability of the selected features, and obtain accurate inferences and predictions. This subject covers recent methodological developments in this area such as inference for high-dimensional regression, empirical Bayes methods, model selection and model combining methods, and post-selection inference methods.
- Advanced Statistical Modelling 12.5 pts
Complex data consisting of dependent measurements collected at different times and locations are increasingly important in a wide range of disciplines, including environmental sciences, biomedical sciences, engineering and economics. This subject will introduce you to advanced statistical methods and probability models that have been developed to address complex data structures, such as functional data, geo-statistical data, lattice data, and point process data. A unifying theme of this subject will be the development of inference, classification and prediction methods able to cope with the dependencies that often arise in these data.
- Mathematics of Risk 12.5 pts
Mathematical modelling of various types of risk has become an important component of the modern financial industry. The subject discusses the key aspects of the mathematics of market risk. Main concepts include loss distributions, risk and dependence measures, copulas, risk aggregation and allocation principles, elements of extreme value theory. The main theme is the need to satisfactorily address extreme outcomes and the dependence of key risk drivers.
- Optimisation for Industry 12.5 pts
The use of mathematical optimisation is widespread in business, where it is a key analytical tool for managing and planning business operations. It is also required in many industrial processes and is useful to government and community organizations. This subject will expose students to operations research techniques as used in industry. A heavy emphasis will be placed on the modelling process that turns an industrial problem into a mathematical formulation. The focus will then be on how to solve the resulting mathematical problem with mixed-integer programming techniques.
- Practice of Statistics & Data Science 12.5 pts
This subject builds on methods and techniques learned in theoretical subjects by studying the application of statistics in real contexts. Emphasis is on the skills needed for a practising statistician, including the development of mature statistical thinking, organizing the structure of a statistical problem, the contribution to the design of research from a statistical point of view, measurement issues and data processing. The subject deals with thinking about data in a broad context, and skills required in statistical consulting.
- Stochastic Calculus with Applications 12.5 pts
This subject provides an introduction to stochastic calculus and mathematics of financial derivatives. Stochastic calculus is essentially a theory of integration of a stochastic process with respect to another stochastic process, created for situations where conventional integration will not be possible. Apart from being an interesting and deep mathematical theory, stochastic calculus has been used with great success in numerous application areas, from engineering and control theory to mathematical biology, theory of cognition and financial mathematics.
- Advanced Probability 12.5 pts
This subject explores a range of key concepts in modern Probability Theory that are fundamental for Mathematical Statistics and are widely used in other applications. We study measurable space, product measure, Fubini's theorem, conditional expectation and conditional probability, construction of i.i.d. and beyond, discrete-time martingales.
- Random Processes 12.5 pts
The subject covers some key aspects of the theory of stochastic processes that plays a central role in modern probability and has numerous applications in natural sciences and industry. We discuss the following topics: ways to construct and specify random processes, functional central limit theorem, Lévy processes, renewal processes and Markov processes (discrete and continuous state space). Applications to modelling random phenomena evolving in time are discussed throughout the course.
- AI Planning for Autonomy 12.5 pts
The key focus of this subject is the foundations of autonomous agents that reason about action, applying techniques such as automated planning, reinforcement learning, game theory, and their real-world applications. Autonomous agents are active entities that perceive their environment, reason, plan and execute appropriate actions to achieve their goals, in service of their users (the real world, human beings, or other agents). The subject focuses on the foundations that enable agents to reason autonomously about goals & rewards, perception, actions, strategy, and the knowledge of other agents during collaborative task execution, and the ethical impacts of agents with this ability.
The programming language used in this subject is Python. No lectures or workshops on Python will be delivered.
Topics are drawn from the field of advanced artificial intelligence including:
- Search algorithms and heuristic functions
- Classical (AI) planning
- Markov Decision Processes
- Reinforcement learning
- Game theory
- Ethics in AI planning
- Advanced Theoretical Computer Science 12.5 pts
At the heart of theoretical computer science are questions of both philosophical and practical importance. What does it mean for a problem to be solvable by computer? What are the limits of computability? Which types of problems can be solved efficiently? What are our options in the face of intractability? This subject covers such questions in the context of a wide-ranging exploration of the nexus between logic, complexity and algorithms, and examines many important (and sometimes surprising) results about the nature of computing.
- Turing machines
- The Church-Turing Thesis
- Decidable languages
- Time Complexity: The classes P and NP, NP-complete problems
- Space complexity: including sub-linear space
- Circuit complexity
- Approximation algorithms
- Probabilistic complexity classes
- Additional topics may include descriptive complexity, interactive proofs, communication complexity, complexity as applied to cryptography
- Recommended background knowledge: finite state automata, pushdown automata, regular languages and context-free languages
Examples of assignments
- Proving the equivalence of a variant of a standard machine to the original version
- Describing an NP-hardness reduction
- Designing an approximation algorithm for an NP-hard problem.
- Algorithms for Bioinformatics 12.5 pts
Technological advances in obtaining high throughput data have stimulated the development of new computational approaches to bioinformatics. This subject will cover core computational challenges in analysing bioinformatics data. We cover important algorithmic approaches and data structures used in solving these problems, and the challenges that arise as these problems increase in scale.
The subject is a core subject in the MSc (Bioinformatics) and is an elective in the Master of Information Technology and the Master of Engineering. It can also be taken by PhD students and by undergraduate students, subject to the approval of the lecturer.
The subject covers key algorithms used in bioinformatics, with a focus on genomics. Indicative topics are: sequence alignment (dynamic programming algorithms and seed-and-extend), genome assembly, variant detection, phylogenetic reconstruction, genomic intervals, complexity and correctness of algorithms, clustering and classification of genomics data, data reduction and visualisation.
The subject assumes you have experience in programming and familiarity with the foundations of genomics.
- Computational Genomics 12.5 pts
The study of genomics is on the forefront of biology. Current laboratory technologies generate huge amounts of data. Computational analysis is necessary to make sense of these data. This subject covers a broad range of approaches to the computational analysis of genomic data. Students learn the theory behind the different approaches to genomic analysis, preparing them to use existing methods appropriately and positioning them to develop new ways to analyse genomic data.
The subject is a core subject in the MSc (Bioinformatics), and is an elective in the Master of Information Technology and the Master of Engineering. It can also be taken by PhD students and by undergraduate students, subject to the approval of the lecturer.
This subject covers computational analysis of genomic data, from the perspective of information theory. Topics include information theoretic analysis of genomic sequences; sequence comparison, including heuristic approaches and multiple sequence alignment; and approaches to motif finding and genome annotation, including probabilistic modelling and visualization, computational detection of RNA families, and current challenges in protein structure determination. Practical work includes writing bioinformatics applications programs and preparing a research report that uses existing bioinformatics web resources.
- Constraint Programming 12.5 pts
The aim of this subject is for students to develop an understanding of approaches to solving combinatorial optimization problems with computers, and to be able to demonstrate proficiency in modelling and solving problems using a high-level modelling language, as well as an understanding of different solving technologies. The modelling language used is MiniZinc.
Topics covered will include:
- Modelling with Constraints
- Global constraints
- Multiple Modelling
- Model Debugging
- Scheduling and Packing
- Finite domain constraint solving
- Mixed Integer Programming
- Cryptography and Security 12.5 pts
The subject will explore foundational knowledge in the area of cryptography and information security. The overall aim is to gain an understanding of fundamental cryptographic concepts like encryption and signatures and use it to build and analyse security in computers, communications and networks. This subject covers fundamental concepts in information security on the basis of methods of modern cryptography, including encryption, signatures and hash functions.
This subject is an elective subject in the Master of Engineering (Software). It can also be taken as an advanced elective in Master of Information Technology.
The subject will be made up of three parts:
- Cryptography: the essentials of public and private key cryptography, stream ciphers, digital signatures and cryptographic hash functions
- Access Control: the essential elements of authentication and authorization; and
- Secure Protocols, which are obtained through cryptographic techniques.
A particular emphasis will be placed on real-life protocols such as Secure Sockets Layer (SSL) and Kerberos.
Topics drawn from:
- Symmetric key cryptosystems
- Public key cryptosystems
- Hash functions
- Secret sharing
- Key Management.
- Declarative Programming 12.5 pts
Declarative programming languages provide elegant and powerful programming paradigms which every programmer should know. This subject presents declarative programming languages and techniques.
- The dangers of destructive update
- Functional programming
- Strong type systems
- Parametric polymorphism
- Algebraic types
- Type classes
- Defensive programming practice
- Higher order programming
- Currying and partial application
- Lazy evaluation
- Logic programming
- Unification and resolution
- Nondeterminism, search, and backtracking
- Distributed Algorithms 12.5 pts
The Internet, World Wide Web, bank networks, mobile phone networks and many others are examples of distributed systems. Distributed systems rely on a key set of algorithms and data structures to run efficiently and effectively. In this subject, we learn the key algorithms that professionals work with when dealing with various systems. Clock synchronization, leader election, mutual exclusion, and replication are just a few areas where multiple well-known algorithms were developed during the evolution of the distributed computing paradigm.
Topics covered include:
- Synchronous and asynchronous network algorithms that address resource allocation, communication
- Consensus among distributed processes
- Distributed data structures
- Data consistency
- Deadlock detection
- Leader election, and
- Global snapshot issues.
- Distributed Systems 12.5 pts
The subject aims to provide an understanding of the principles on which the Web, email, DNS and other interesting distributed systems are based. Questions concerning distributed architecture, concepts and design, and how these meet the demands of contemporary distributed applications, will be addressed.
Topics covered include: characterization of distributed systems, system models, interprocess communication, remote invocation, indirect communication, operating system support, distributed objects and components, web services, security, distributed file systems, and name services.
- Internet Technologies 12.5 pts
The subject will introduce the basics of computer networks to students through a study of layered models of computer networks and applications. The first half of the subject deals with data communication protocols in the lower layers of the OSI and TCP/IP reference models. Students will be exposed to the workings of various fundamental networking technologies such as wireless, LAN, RFID and sensor networks. The second half of the subject deals with the upper layers of the TCP/IP reference model through a study of several Internet applications.
Topics covered include: introduction to the Internet, OSI reference model layers, protocols and services, data transmission basics, interface standards, network topologies, data link protocols, message routing, LANs, WANs, TCP/IP suite, detailed study of common network applications (e.g., email, news, FTP, Web), network management, and current and future developments in network hardware and protocols.
- Mobile Computing Systems Programming 12.5 pts
Mobile devices are ubiquitous nowadays. Mobile computing encompasses technologies, devices and software that enable (wireless) access to services anyplace, anytime, and anywhere. This subject will cover fundamental mobile computing techniques and technologies, and explain challenges that are unique to the design, implementation, and evaluation of mobile computing. In particular, this subject will enable students to develop mobile phone applications that take advantage of the unique sensing capabilities of mobile devices, their multi-modal interaction capabilities, and their ability to sense and respond to context.
- Parallel and Multicore Computing 12.5 pts
The subject aims to introduce students to parallel algorithms and their analysis. Fundamental principles of parallel computing are discussed. Various parallel architectures and programming platforms are introduced. Parallel algorithms for different architectures, as well as parallel algorithms addressing specific scientific problems, are critically analysed.
Topics include: principles of parallel computing, PRAM model, PRAM algorithms, parallel architectures, OpenMP, shared memory algorithms, systolic algorithms, parallel communication patterns, PVM/MPI, scientific applications, hypercube, graph embeddings and extended parallel computing models.
- Programming Language Implementation 12.5 pts
Good craftsmen know their tools, and compilers are amongst the most important tools that programmers use. There are many ways in which familiarity with compilers helps programmers. For example, knowledge of semantic analysis helps programmers understand error messages, and knowledge of code generation techniques helps programmers debug problems at the assembly language level. The technologies used in compiler development are also useful when implementing other kinds of programs. The concepts and tools used in the analysis phases of a compiler are useful for any program whose input has a structure that is non-trivial to recognize, while those used in the synthesis phases are useful for any program that generates commands for another system. This subject provides an understanding of the main principles of programming language implementation, as well as first-hand experience of the application of those principles.
The subject describes how compilers analyse source programs, how they translate them to target programs, and what tools are available to support these tasks. Topics covered include compiler structures; lexical analysis; syntax analysis; semantic analysis; intermediate representations of programs; code generation; and optimisation.
- Natural Language Processing 12.5 pts
Much of the world's knowledge is stored in the form of text, and accordingly, understanding and harnessing knowledge from text are key challenges. In this subject, students will learn computational methods for working with text, in the form of natural language understanding, and language generation. Students will develop an understanding of the main algorithms used in natural language processing, for use in a diverse range of applications including machine translation, text mining, sentiment analysis, and question answering. The programming language used is Python.
Topics covered may include:
- Text classification and unsupervised topic discovery
- Vector space models for natural language semantics
- Structured prediction for tagging
- Syntax models for parsing of sentences and documents
- N-gram language modelling
- Automatic translation, and multilingual methods
- Relation extraction and coreference resolution
- Stream Computing and Applications 12.5 pts
With exponential growth in data generated from sensor data streams, search engines, spam filters, medical services, online analysis of financial data streams, and so forth, there is demand for fast monitoring and storage of huge amounts of data in real time. Traditional technologies were not designed for such fast streams of data. Usually they required data to be stored and indexed before it could be processed.
Stream computing was created to tackle problems that require processing and classification of continuous, high-volume data streams. It is widely used in applications such as Twitter, Facebook and high-frequency trading.
This subject will focus on the algorithms and data structures behind the analysis and management of streams. Theoretical underpinnings are emphasized, with implementation of some fundamental algorithms.
Topics covered include:
- Why stream processing is important
- Hash functions, probability, and fundamental data structures
- Data stream model
- Data stream algorithms: Sampling, sketching, distinct items, frequent items, frequency moments, etc.
- Data stream mining: clustering, histograms, query tracking
- Graph streams: connectivity, matchings, covers
- Knowledge Management Systems 12.5 pts
This subject focuses on how Knowledge Management (KM) and a range of Information Technologies and analysis techniques are used to support KM initiatives in organisations. Technologies likely to be considered are: collaborative and social media tools; corporate knowledge directories; data warehouses and other repositories of organizational memory; business intelligence including data-mining; process automation; workflow and document management. The emphasis is on high-level decision-making and the rationale of technology-based initiatives and their impact on organizational knowledge and its use. This subject supports course-level objectives by allowing students to develop analytical skills to understand the complexity of real-world KM work in organisations. It promotes innovative thinking around the deployment of existing and emerging information technologies for KM. The subject contributes to the development of independent critical inquiry, analysis and reflection.
Techniques of analysis and design likely to be learned are: critical thinking, discourse analysis and design thinking. Real-world case studies in the form of fieldwork are conducted, likely drawn from the following domains: software industry; retail; creative/fashion industry; manufacturing; emergency management. Real case-study work will shape thinking about IT support for KM in these industries.
- Data Warehousing 12.5 pts
Data warehouses are designed to provide organisations with an integrated set of high quality data to support decision-makers. They should support flexible and multi-dimensional retrieval and analysis of data. Topics covered include data warehousing and decision-making, data warehouse design, data warehouse implementation, data sourcing and data quality, on-line analytical processing (OLAP) and data mining, customer relationship management systems, and case studies of data warehousing practice. This subject is part of the Business Analytics stream within the Master of Information Systems.
Students who have a weighted average mark of at least 75% in the Master of Information Systems have the option to complete the on-line Advanced Elective ISYS90094 Business Analysis and Decision Making instead of ISYS90086 Data Warehousing.
This subject introduces the compelling need for data warehousing; data warehouse architectures; decision making; data warehouse design; data warehouse modelling; data quality; data warehouse implementation, including the Extract, Transform, Load (ETL) process; and data warehouse use in supporting decision making, including decision-making tools and OLAP. Readings are provided for all topics. They introduce real-world cases on data warehousing and related areas, including the use of data warehousing for competitive advantage and success and failure stories in data warehousing.
Professional skills subjects
Take no more than 25 points of the following subjects.
- Science Communication 12.5 pts
Why is it essential that scientists learn to communicate effectively to a variety of audiences? What makes for engaging communication when it comes to science? How does the style of communication need to change for different audiences? What are the nuts and bolts of good science writing? What are the characteristics of effective public speaking?
Weekly seminars and tutorials will consider the important role science and technology play in twenty-first century society and explore why it is vital that scientists learn to articulate their ideas to a variety of audiences in an effective and engaging manner. These audiences may include school students, agencies that fund research, the media, government, industry, and the broader public. Other topics include the philosophy of science communication, talking about science on the radio, effective public speaking, writing press releases and science feature articles, science performance, communicating science on the web and how science is reported in the media.
Students will develop skills in evaluating examples of science and technology communication to identify those that are most effective and engaging. Students will also be given multiple opportunities to receive feedback and improve their own written and oral communication skills.
Students will work in small teams on team projects to further the communication skills developed during the seminar programme. These projects will focus on communicating a given scientific topic to a particular audience using spoken, visual, written or web-based communication.
- Communication for Research Scientists 12.5 pts
As a scientist, it is not only important to be able to experiment, research and discover, it is also vital that you can communicate your research effectively in a variety of ways. Even the most brilliant research is wasted if no one knows it has been done or if your target audience is unable to understand it.
In this subject you will develop your written and oral communication skills to ensure that you communicate your science as effectively as possible. We will cover effective science writing and oral presentations across a number of formats: writing a thesis; preparing, submitting and publishing journal papers; searching for, evaluating and citing appropriate references; peer review; making the most of conferences; applying for grants and jobs; and using social media to publicise your research.
You will have multiple opportunities to practise, receive feedback and improve both your oral and written communication skills.
Please note: students must be undertaking their own research in order to enrol in this subject.
- Science in Schools 12.5 pts
This subject will provide an understanding of your university studies within Victorian schools through a substantial school-based experience.
The subject includes a placement of up to 20 hours within a Victorian school classroom, offering an opportunity to collaborate as a Tertiary Student Assistant (TSA) under the guidance of a qualified teacher.
- Science and Technology Internship 12.5 pts
This subject involves completion of an 80–100-hour science or technology work placement integrating academic learning in science areas of study, employability skills and attributes, and an improved knowledge of science and technology organisations, workplace culture and career pathways. The placement is supplemented by pre- and post-placement classes designed to develop an understanding of science and technology professions, and to introduce skills for developing, identifying and articulating employability skills and attributes and linking them to employer requirements in the science and technology domains. Work conducted during the placement will be suitable for a graduate level of expertise and experience. While immersed in a work environment, students will be expected to challenge themselves by accepting roles and responsibilities that stretch their existing capabilities. They will interrogate the requirements of specific careers and continually monitor their own progress towards developing the necessary knowledge, skills and attributes to thrive in these roles.
Students are responsible for identifying a suitable work placement prior to the semester, with the support of the Subject Coordinator. In the semester prior to your placement, you should attend Careers & Employment (C&E) employment preparation seminars and workshops, and access other C&E resources (http://careers.unimelb.edu.au) to assist you in identifying potential host organisations. You should begin approaching organisations at least 4 weeks before the placement. More information is available on the subject webpage: https://science.unimelb.edu.au/students/plan-your-study/internship-subjects. If you have problems finding a placement, you should contact the Careers and Industry team in the Faculty of Science at email@example.com.
On completion of the subject, students will have completed and reported on a course-related project in a science or technology workplace. They will also have enhanced employability skills including communication, interpersonal, analytical and problem-solving, organisational and time-management, and an understanding of career planning and professional development.
Data Science Research Project
High-achieving students can choose to undertake a 25-point individual research project.
- Data Science Research Project Pt1 12.5 pts
In this subject, students undertake a substantial research program in the area of Data Science. The research will be conducted under the supervision of a member of the School of Mathematics and Statistics or the Computing and Information Systems academic staff. The results will be reported in the form of a thesis and an oral presentation.
- Data Science Research Project Pt2 12.5 pts
In this subject, students undertake a substantial research program in the area of Data Science. The research will be conducted under the supervision of a member of the School of Mathematics and Statistics or the Computing and Information Systems academic staff. The results will be reported in the form of a thesis and an oral presentation.