About Prompt
- Prompt Type – Dynamic
- Prompt Platform – ChatGPT, Grok, DeepSeek, Gemini, Copilot, Midjourney, Meta AI and more
- Niche – Specific AI Niche
- Language – English
- Category – Code Generation
- Prompt Title – Python Code Generation Prompt for Beginners in Data Science
Prompt Details
This prompt helps data science beginners generate Python code for common tasks. It is dynamic: each time you use it, you specify the dataset, the task, the libraries, and the desired output format. Because it is plain text with no platform-specific syntax, it can be used across different AI platforms.
**Prompt Structure:**
````
I want you to act as a Python code generator for beginner data science projects. I will provide you with the following information:

1. **Dataset Description:** A detailed description of the dataset I'm working with. This should include:
   * File format (e.g., CSV, Excel, JSON)
   * File path or URL (if publicly accessible)
   * Key columns and their data types (e.g., 'price': numerical, 'product_name': categorical)
   * A brief explanation of the dataset's purpose (e.g., predicting customer churn, analyzing sales trends)
2. **Task Description:** A clear explanation of the data science task I want to perform. Be as specific as possible. Examples include:
   * Data cleaning (e.g., handling missing values, removing duplicates)
   * Exploratory Data Analysis (EDA) (e.g., generating descriptive statistics, creating visualizations)
   * Data preprocessing (e.g., one-hot encoding, feature scaling)
   * Model building (e.g., linear regression, decision tree classification)
   * Model evaluation (e.g., calculating accuracy, precision, recall)
3. **Desired Libraries:** A list of Python libraries you should use in the generated code. If no preference is specified, use common data science libraries like pandas, NumPy, scikit-learn, matplotlib, and seaborn.
4. **Desired Output Format:** Specify the desired output format. Examples include:
   * Python script (.py)
   * Jupyter Notebook (.ipynb)
   * Code snippet for a specific function or class
   * Visualizations (e.g., matplotlib plots, seaborn charts) – Specify the plot type (e.g., scatter plot, histogram, bar chart).
5. **Additional Requirements (Optional):** Any other specific instructions, constraints, or preferences for the code generation. Examples include:
   * Code comments explaining each step.
   * Specific functions or methods to be used.
   * Handling imbalanced datasets.
   * Performance optimization considerations.

Once I provide this information, generate the corresponding Python code. Ensure the code is well-commented, readable, and follows best practices for data science.

**Example Prompt:**

I want you to act as a Python code generator for beginner data science projects.

1. **Dataset Description:** I have a CSV file named 'customer_churn.csv' located at './data/customer_churn.csv'. It contains information about telecom customers, including their tenure, monthly charges, and whether they churned (yes/no). Key columns include 'tenure' (numerical), 'MonthlyCharges' (numerical), and 'Churn' (categorical). The goal is to predict customer churn based on these features.
2. **Task Description:** I want to perform EDA on this dataset. Specifically, I want to generate descriptive statistics for the numerical columns and create a bar chart showing the churn distribution.
3. **Desired Libraries:** pandas, matplotlib
4. **Desired Output Format:** Jupyter Notebook (.ipynb)
5. **Additional Requirements:** Add comments to the code explaining each step.

**Expected Output (Illustrative):**

```python
# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
df = pd.read_csv('./data/customer_churn.csv')

# Display descriptive statistics for numerical columns
print(df.describe())

# Create a bar chart showing churn distribution
churn_counts = df['Churn'].value_counts()
churn_counts.plot(kind='bar')
plt.title('Customer Churn Distribution')
plt.xlabel('Churn')
plt.ylabel('Count')
plt.show()
```
````
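To see what the two steps in the illustrative output produce without needing the CSV on disk, the following minimal sketch runs them on a few made-up rows. The sample values and the `SAMPLE_CSV` name are assumptions for illustration, not data from a real churn dataset; only the column names come from the example prompt. Plotting is left out so the sketch depends on pandas alone.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for ./data/customer_churn.csv.
# The values are invented for illustration only.
SAMPLE_CSV = """tenure,MonthlyCharges,Churn
1,29.85,No
34,56.95,No
2,53.85,Yes
45,42.30,No
8,99.65,Yes
"""

# Read the sample from memory instead of from a file path.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Restrict describe() to the numerical columns named in the example prompt.
numeric_summary = df[["tenure", "MonthlyCharges"]].describe()

# value_counts() returns one count per category; these counts are
# what the bar chart in the illustrative output would plot.
churn_counts = df["Churn"].value_counts()

print(numeric_summary)
print(churn_counts)
```

Running generated code against a small sample like this is a quick way to confirm that column names and data types match your description before pointing the code at the full file.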
This dynamic prompt lets you tailor your request to a wide range of beginner-level data science tasks. Replace the example information with your own project details before using it.
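If you reuse the prompt often, filling in the five fields can be scripted. The sketch below is one possible way to do that, not part of the original prompt: the `build_prompt` function, its parameter names, and `DEFAULT_LIBRARIES` are hypothetical helpers introduced here, and it uses only the Python standard library. It produces the filled-in request in the same shape as the example prompt; the field explanations from the full template are omitted.

```python
# Fallback list taken from the prompt's "Desired Libraries" item.
DEFAULT_LIBRARIES = ["pandas", "NumPy", "scikit-learn", "matplotlib", "seaborn"]


def build_prompt(dataset, task, output_format, libraries=None, extra=None):
    """Assemble the filled-in prompt as plain text (hypothetical helper)."""
    # Use the common data science libraries when none are specified.
    libs = ", ".join(libraries or DEFAULT_LIBRARIES)
    lines = [
        "I want you to act as a Python code generator for beginner data science projects.",
        f"1. **Dataset Description:** {dataset}",
        f"2. **Task Description:** {task}",
        f"3. **Desired Libraries:** {libs}",
        f"4. **Desired Output Format:** {output_format}",
    ]
    # The fifth field is optional, so it is added only when provided.
    if extra:
        lines.append(f"5. **Additional Requirements:** {extra}")
    return "\n".join(lines)


prompt = build_prompt(
    dataset="A CSV file at './data/customer_churn.csv' with columns "
            "'tenure' (numerical), 'MonthlyCharges' (numerical), and 'Churn' (categorical).",
    task="Perform EDA: descriptive statistics and a bar chart of the churn distribution.",
    output_format="Jupyter Notebook (.ipynb)",
    libraries=["pandas", "matplotlib"],
    extra="Add comments to the code explaining each step.",
)
print(prompt)
```

The resulting text can be pasted into whichever AI platform you use.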