IT 세계의 후아

[kaggle] Ubiquant Market Prediction

후__아 2024. 5. 1. 23:26

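I had been wanting to work with financial data, and then I found a competition on Kaggle that had already ended..!

Kindly enough, there was code with explanations aimed at beginners (and at people like me who majored in this and forgot it all ^^), so I worked through it as a refresher.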

 

https://www.kaggle.com/code/miingkang/ml-from-the-beginning-to-the-end-for-newbies/notebook

 

ML from the beginning to the end (For newbies🐢)

 

First. Big Picture -🏔

To attempt to predict returns, there are many computer-based algorithms and models for financial market trading. Yet, with new techniques and approaches, data science could improve quantitative researchers' ability to forecast an investment's return.

Ubiquant is committed to creating long-term stable returns for investors.

In this competition, you’ll build a model that forecasts an investment's return rate. Train and test your algorithm on historical prices. Top entries will solve this real-world data science problem with as much accuracy as possible.

 

Second. Problem definition -✏

"This dataset contains features derived from real historic data from thousands of investments."

Your challenge is to predict the value of an obfuscated metric relevant for making trading decisions.

  • row_id - A unique identifier for the row.
  • time_id - The ID code for the time the data was gathered. The time IDs are in order, but the real time between the time IDs is not constant and will likely be shorter for the final private test set than in the training set.
  • investment_id - The ID code for an investment. Not all investments have data in all time IDs.
  • target - The target.
  • [f_0:f_299] - Anonymized features generated from market data.
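
To make the column layout concrete, here is a small sketch with a made-up table in the same shape. The values, the `row_id` strings, and the use of only two feature columns are assumptions for illustration, not taken from the real data. It also checks the point above that some investments are missing at some time IDs.

```python
import pandas as pd

# Toy rows in the layout described above. All values are invented, and only
# two of the f_0..f_299 feature columns are included to keep it short.
toy = pd.DataFrame({
    "row_id": ["0_1", "0_2", "1_1"],   # hypothetical ID strings
    "time_id": [0, 0, 1],
    "investment_id": [1, 2, 1],
    "target": [0.10, -0.20, 0.05],
    "f_0": [0.5, -1.2, 0.3],
    "f_1": [1.1, 0.4, -0.7],
})

# How many distinct investments have a row at each time_id?
coverage = toy.groupby("time_id")["investment_id"].nunique()

# Which investments are missing from at least one time_id?
n_times = toy["time_id"].nunique()
times_per_investment = toy.groupby("investment_id")["time_id"].nunique()
incomplete = times_per_investment[times_per_investment < n_times].index.tolist()

print(coverage)
print("investments with gaps:", incomplete)
```

In this toy table, investment 2 only appears at time_id 0, so it is the one reported as having a gap.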

The performance metric is the mean of the Pearson correlation coefficient, computed separately for each time ID.
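
As a rough sketch of how this metric could be computed locally (not the official scoring code), the function below takes the Pearson correlation between target and prediction within each `time_id` and averages the results. The function name `mean_pearson_by_time`, the `prediction` column name, and the toy values are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def mean_pearson_by_time(df, target_col="target", pred_col="prediction"):
    """Average the per-time_id Pearson correlation of target vs. prediction.

    Hypothetical helper for local validation. If the predictions (or targets)
    are constant within a time_id, the correlation for that group is NaN.
    """
    scores = []
    for _, group in df.groupby("time_id"):
        # Off-diagonal entry of the 2x2 correlation matrix is Pearson's r.
        r = np.corrcoef(group[target_col], group[pred_col])[0, 1]
        scores.append(r)
    return float(np.mean(scores))


# Toy data: predictions match the target ordering at time_id 0
# and reverse it at time_id 1.
toy_scores = pd.DataFrame({
    "time_id":    [0, 0, 0, 1, 1, 1],
    "target":     [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "prediction": [1.0, 2.0, 3.0, 3.0, 2.0, 1.0],
})

score = mean_pearson_by_time(toy_scores)
print(score)
```

Because the correlation is taken within each time ID before averaging, a model has to rank investments well at every time step. In the toy data, the perfect ordering at one time ID is cancelled out by the reversed ordering at the other, so the score comes out close to 0.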

 

Third. Data & Import