Skip to content

mgadhvi/lego-sets-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

LEGO Sets and Themes Analysis

Project Overview

This project analyzes LEGO sets and their associated themes to determine the most popular themes based on the number of sets available. The analysis is performed using SQL queries to extract relevant data from a database, and the results are visualized using matplotlib.

Table of Contents

  • Technologies Used
  • Database Schema
  • Installation
  • Usage
  • SQL Queries
  • Visualization
  • Contributing
  • License

Technologies Used

  • Python
  • SQLite (or any other SQL database)
  • Pandas (for data analysis and manipulation)
  • Matplotlib (for data visualization)

Database Schema

create table if not exists themes (
  id smallint primary key,
  name varchar(64),
  parent_id smallint
);

create table if not exists colors (
  id smallint primary key,
  name varchar(64),
  rgb varchar(6),
  is_trans varchar(1)
);

create table if not exists part_categories (
  id smallint primary key,
  name varchar(64)
);

create table if not exists parts (
  part_num varchar(16) primary key,
  name varchar(255),
  part_cat_id smallint,
  part_material_id smallint,
  foreign key(part_cat_id) references part_categories(id)
);

create table if not exists part_relationships (
  rel_type varchar(1),
  child_part_num varchar(20),
  parent_part_num varchar(20)
);

create table if not exists elements (
  element_id varchar(16) primary key,
  part_num varchar(16),
  color_id smallint,
  foreign key(part_num) references parts(part_num),
  foreign key(color_id) references colors(id)
);

create table if not exists minifigs (
  fig_num varchar(20) primary key,
  name varchar(256),
  num_parts smallint
);

create table if not exists inventories (
  id int primary key,
  version smallint,
  set_num varchar(16),
  foreign key(set_num) references sets(set_num)
);

create table if not exists sets (
  set_num varchar(16) primary key,
  name varchar(128),
  year smallint,
  theme_id smallint,
  num_parts int,
  img_url varchar(128),
  foreign key(theme_id) references themes(id)
);

create table if not exists inventory_parts (
  inventory_id int,
  part_num varchar(16),
  color_id smallint,
  quantity smallint,
  is_spare varchar(1),
  foreign key(inventory_id) references inventories(id),
  foreign key(color_id) references colors(id)
);

create table if not exists inventory_sets (
  inventory_id int,
  set_num varchar(16),
  quantity smallint,
  foreign key(inventory_id) references inventories(id),
  foreign key(set_num) references sets(set_num)
);

create table if not exists inventory_minifigs (
  inventory_id int,
  fig_num varchar(20),
  quantity smallint,
  foreign key(inventory_id) references inventories(id),
  foreign key(fig_num) references minifigs(set_num)
);

Installation

  1. Clone the repository:
    git clone https://github.com/mgadhvi/lego-sets-analysis.git
    cd lego-sets-analysis
        
  2. Install the required Python packages from the requirements.txt file:
    pip install -r requirements.txt
        
  3. Set up your SQLite database with the lego_sets and lego_themes tables from here

Usage

  1. Update the database connection string in the Python script to point to your SQLite database file.
  2. Run the Python script to execute the SQL queries and generate the pie chart:
    python analyze_lego_themes.py
        

SQL Queries

The main SQL query used to retrieve the top 10 most popular LEGO themes is as follows:

SELECT theme_name, COUNT(set_id) AS set_count
FROM lego_themes
LEFT JOIN lego_sets ON lego_themes.theme_id = lego_sets.theme_id
GROUP BY theme_name
ORDER BY set_count DESC
LIMIT 10;

This query counts the number of sets for each theme and orders the results to find the top 10 themes.

Visualization

The results are visualized using a pie chart created with Matplotlib. The pie chart displays the distribution of the top 10 LEGO themes based on the number of sets.

Contributing

Contributions are welcome! If you have suggestions for improvements or new features, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

An analysis of Lego sets stored in a PostgreSQL database

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published