- What is Scikit-learn?
- How is Scikit-learn different from other ML libraries like TensorFlow or Keras?
- What are the key features of Scikit-learn?
- Which programming language is Scikit-learn based on?
- How do you install Scikit-learn?
- What are the main modules available in Scikit-learn?
- What types of models are supported in Scikit-learn?
- What is a pipeline in Scikit-learn?
- How do you load datasets using Scikit-learn?
- How do you check the version of Scikit-learn?
- What is StandardScaler in Scikit-learn?
- How does MinMaxScaler work?
- Explain how OneHotEncoder is used.
- What is the purpose of LabelEncoder?
- How do you handle missing data in Scikit-learn?
- What is SimpleImputer and how is it used?
- What is the difference between fit(), transform(), and fit_transform()?
- How does ColumnTransformer work?
- What’s the purpose of FunctionTransformer?
- How do you normalize data in Scikit-learn?
- How do you split a dataset using Scikit-learn?
- What is train_test_split()?
- What is cross-validation and how is it implemented in Scikit-learn?
- What is StratifiedKFold?
- What are scoring metrics available in Scikit-learn?
- Explain confusion_matrix.
- What is classification_report()?
- How do you use roc_auc_score()?
- What is mean_squared_error() used for?
- Explain how you would compare models in Scikit-learn.
- How do you implement logistic regression in Scikit-learn?
- How is decision tree classification performed in Scikit-learn?
- What is the use of KNeighborsClassifier?
- How do you train a random forest classifier?
- What is SGDClassifier?
- What hyperparameters can you tune in a classifier?
- How do you evaluate multi-class classification models?
- What is predict_proba()?
- How does Naive Bayes work in Scikit-learn?
- How do you handle imbalanced datasets?
- What is linear regression in Scikit-learn?
- What’s the difference between LinearRegression and Ridge?
- What is Lasso regression?
- How do you implement polynomial regression?
- How is decision tree regression done?
- What is mean_absolute_error()?
- How do you tune hyperparameters in regression?
- What is R² score?
- How do you check residuals in Scikit-learn?
- What is ElasticNet regression?
- How do you perform K-Means clustering?
- What is DBSCAN in Scikit-learn?
- What is Agglomerative Clustering?
- How do you evaluate clustering results?
- What is the silhouette score?
- How does KMeans.predict() work?
- Can you use pipelines for clustering?
- How do you visualize clusters in Scikit-learn?
- What is MiniBatchKMeans?
- What are the limitations of K-Means?
- What is PCA and how is it implemented?
- How do you determine the number of components in PCA?
- What is TruncatedSVD used for?
- How is SelectKBest used for feature selection?
- What is mutual information in feature selection?
- What is VarianceThreshold?
- How do you remove highly correlated features?
- What is feature importance and how is it used?
- What is recursive feature elimination (RFE)?
- How do you handle multicollinearity?
- What is GridSearchCV?
- What is RandomizedSearchCV?
- How do you perform hyperparameter tuning?
- What is the difference between GridSearchCV and RandomizedSearchCV?
- What is Pipeline and how does it help in tuning?
- How can you evaluate different models in a loop?
- What is make_pipeline()?
- How do you avoid overfitting?
- What is cross_val_score()?
- What is check_estimator()?
- How do you build custom transformers?
- What is a meta-estimator?
- How do you ensemble models in Scikit-learn?
- What is VotingClassifier?
- What is BaggingClassifier?
- How do you use StackingClassifier?
- What is the difference between Pipeline and make_pipeline()?
- What is the clone() function in Scikit-learn?
- How does Scikit-learn handle categorical features?
- How can you use Scikit-learn with pandas?
- How do you export a trained model?
- What is joblib and how is it used?
- How do you serve Scikit-learn models in a Flask app?
- How do you validate a model before deployment?
- How do you automate pipelines?
- How do you handle real-time predictions?
- How can Scikit-learn be used in production?
- How do you monitor model drift?
- Can you use Scikit-learn in Spark or distributed systems?
- How do you document Scikit-learn pipelines?
- What is the Surprise library in Python?
- What are the key features of Surprise?
- What kind of recommendation systems can you build using Surprise?
- How is Surprise different from Scikit-learn?
- What are the main components of the Surprise library?
- How do you install the Surprise library?
- What data formats does Surprise support?
- How do you load datasets into Surprise?
- What is a Dataset object in Surprise?
- What is a Reader class in Surprise?
- How do you load a custom dataset in Surprise?
- What is the significance of the Reader class?
- What does Dataset.load_from_file() do?
- What is the role of Dataset.load_builtin()?
- What is the Dataset.load_from_df() function used for?
- What are the columns required in a dataset for Surprise?
- How do you split data into training and testing sets?
- What does train_test_split() return?
- How is Dataset.build_full_trainset() used?
- What is the use of build_testset()?
- What algorithms are available in Surprise?
- How does the SVD algorithm work in Surprise?
- What is KNNBasic in Surprise?
- How is KNNWithMeans different from KNNBasic?
- What is KNNWithZScore?
- What is KNNBaseline?
- How is the BaselineOnly algorithm used?
- What is SlopeOne in Surprise?
- What is CoClustering used for?
- What is the difference between collaborative and content-based filtering in Surprise?
- How do you train an SVD model in Surprise?
- What is the fit() method used for?
- What are hyperparameters in Surprise models?
- How do you tune hyperparameters in Surprise?
- How do you train a model on the entire dataset?
- What does model.predict() return?
- How do you predict ratings for a given user-item pair?
- What does the est attribute in prediction mean?
- How do you evaluate the model after training?
- How do you train multiple models on the same dataset?
- What evaluation metrics are supported by Surprise?
- What is RMSE and how is it computed?
- What is MAE in recommendation systems?
- What is FCP and how is it useful?
- What is precision@k and recall@k?
- How do you use accuracy.rmse() in Surprise?
- What does accuracy.mae() return?
- How do you evaluate a model using cross-validation?
- What is cross_validate() used for?
- What is the difference between evaluate() and cross_validate()?
- How is K-Fold cross-validation implemented in Surprise?
- What is KFold() used for?
- How do you specify the number of folds?
- How is LeaveOneOut() different from KFold?
- What is ShuffleSplit() in Surprise?
- What are the pros and cons of using cross-validation?
- How do you evaluate model performance across folds?
- What is the importance of reproducibility in CV?
- How can you get averaged metrics from CV?
- What is verbose=True used for in cross_validate()?
- What similarity measures are supported by Surprise?
- How do you use cosine similarity in Surprise?
- What is Pearson correlation and how is it used in Surprise?
- What is sim_options?
- How do you set up item-based vs user-based filtering?
- How do you get nearest neighbors for an item?
- How do you find similar users in Surprise?
- How do you visualize similarity matrices?
- How do you extract raw similarity scores?
- Can you build your own similarity function?
- What is GridSearchCV in Surprise?
- How do you define a parameter grid?
- How do you tune the number of latent factors in SVD?
- What does n_factors do in SVD?
- What is n_epochs?
- What does lr_all refer to in tuning?
- How do you balance bias-variance using tuning?
- What is overfitting in recommendation systems?
- How do you use best parameters from tuning?
- What is the output of GridSearchCV.best_score?
- How do you recommend top-N items for a user?
- How do you remove already rated items from recommendations?
- How do you use real-time feedback in Surprise?
- How do you update a trained model with new data?
- Can you integrate Surprise with Flask/Django?
- How can you visualize recommendations?
- How can Surprise be integrated with pandas?
- How do you serialize (save) a Surprise model?
- How do you load a trained model in Surprise?
- What is the dump module in Surprise?
- What kind of datasets are ideal for Surprise?
- How do you evaluate a recommender system in production?
- How do you scale a Surprise-based recommendation engine?
- Can Surprise handle cold-start problems?
- How do you deal with sparsity in Surprise?
- Can you use implicit feedback with Surprise?
- How do you extend Surprise for hybrid recommendation?
- How can you use Surprise for business recommendation engines?
- What are the limitations of Surprise?
- When should you avoid using Surprise?