Evaluating ML methods for semi-structured data
Description
This project aims to understand the performance of various ML algorithms on semi-structured data, which can alternatively be viewed as data with many missing variables. The student will:
- Explore real-world data from field trials and historical records, collected to support agricultural liming recommendations for NSW and the ACT. The data is highly heterogeneous and sparse.
- Select well-known ML methods suitable for the data.
- Extract data from a Knowledge Graph (KG) and prepare it for the selected methods.
- Apply and optimise methods to build predictive models for liming recommendations.
- Evaluate models.
- Compare and contrast performance results with novel and developing KG learning methods.
- Provide recommendations based on the evaluation to potentially improve the KG learning methods.
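As a rough illustration of the extraction step and of why semi-structured data can be read as data with many missing variables, the sketch below pivots a handful of KG-style triples into a fixed-width table. The triples, the entity and predicate names (`site1`, `soil_pH`, `lime_rate_t_ha`, `rainfall_mm`) and both helper functions are hypothetical and are not taken from the project's KG; it is a minimal sketch under those assumptions, not the project's pipeline.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples. All names and values
# are illustrative assumptions, not data from the project's KG.
triples = [
    ("site1", "soil_pH", 4.8),
    ("site1", "lime_rate_t_ha", 2.5),
    ("site2", "soil_pH", 5.2),
    ("site2", "rainfall_mm", 650.0),
    ("site3", "lime_rate_t_ha", 1.0),
]


def triples_to_table(triples):
    """Pivot triples into one row per subject and one column per predicate.

    Predicates a subject never mentions become None (a missing value).
    If a subject repeats a predicate, the last value seen wins.
    """
    rows = defaultdict(dict)
    for subject, predicate, obj in triples:
        rows[subject][predicate] = obj
    columns = sorted({predicate for _, predicate, _ in triples})
    table = {
        subject: [attrs.get(column) for column in columns]
        for subject, attrs in rows.items()
    }
    return columns, table


def missing_fraction(table):
    """Return the share of cells in the table that are None."""
    cells = [value for row in table.values() for value in row]
    return sum(value is None for value in cells) / len(cells)


columns, table = triples_to_table(triples)
print(columns)
for subject, row in table.items():
    print(subject, row)
print(f"missing: {missing_fraction(table):.0%}")
```

Even in this toy case a large share of the cells is empty, so any conventional ML method applied to such a table would need either native support for missing values or an explicit imputation step.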
The student will work with a broader project team and will have access to the project's multi-GPU cluster server. The project is available for up to 2 students, which would increase the range of algorithms under consideration. It is most suitable for a 12-unit course, although other sizes are possible.