Linear model SGD or averaged perceptron as a benchmark for both dense array input and sparse input (bag of words representation of a text document) would be very nice to have.
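To make the ask concrete, here is a minimal pure-Python sketch (all function names are illustrative, not library API) of the same averaged-perceptron-style update applied to a dense feature vector and to a sparse bag-of-words dict, which is why a benchmark should cover both input representations:

```python
# Hypothetical sketch: one perceptron update on dense vs sparse input.

def perceptron_update_dense(w, x, y, lr=1.0):
    """One perceptron step on a dense feature vector (list of floats)."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    pred = 1 if score >= 0 else -1
    if pred != y:                      # update only on a mistake
        w = [wi + lr * y * xi for wi, xi in zip(w, x)]
    return w

def perceptron_update_sparse(w, x, y, lr=1.0):
    """Same step on sparse input: a {feature_index: value} dict."""
    score = sum(w.get(i, 0.0) * v for i, v in x.items())
    pred = 1 if score >= 0 else -1
    if pred != y:
        for i, v in x.items():         # touch only the nonzero features
            w[i] = w.get(i, 0.0) + lr * y * v
    return w

dense_w = perceptron_update_dense([0.0, 0.0, 0.0], [1.0, 2.0, 0.0], -1)
sparse_w = perceptron_update_sparse({}, {0: 1.0, 1: 2.0}, -1)
print(dense_w)   # [-1.0, -2.0, 0.0]
print(sparse_w)  # {0: -1.0, 1: -2.0}
```

The sparse update costs O(nnz) per example instead of O(n_features), which is the whole point of the bag-of-words representation and why the two input types can have very different benchmark profiles.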
Thank you for highlighting this use case. I would like to contribute to this benchmark and develop the idea of stochastic gradient descent (SGD) for linear models. SGD is widely used in both machine learning and deep learning, and I would like to outline where it applies:
1) When we want accurate predictions, stochastic gradient descent moves the parameters along the steepest downward slope of the cost surface: at each step we estimate how wrong the model's predictions are and adjust the model's parameters accordingly.
2) The update rule for the model parameters is
θ_new = θ_old − α ∇J(θ_old)
where α is the learning rate and ∇J(θ_old) is the gradient of the cost function at the current parameters.
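The update rule above can be sketched on a toy cost J(θ) = θ², whose gradient is 2θ (the function names here are illustrative, not an existing API):

```python
# Minimal sketch of theta_new = theta_old - lr * grad(theta_old)
# on the toy cost J(theta) = theta**2, with gradient 2 * theta.

def gradient_step(theta, lr, grad):
    """One step of the update rule: theta - lr * grad(theta)."""
    return theta - lr * grad(theta)

grad_J = lambda theta: 2.0 * theta   # d/dtheta of theta**2

theta = 5.0
for _ in range(20):
    theta = gradient_step(theta, lr=0.1, grad=grad_J)

# each step multiplies theta by (1 - 0.2), so it shrinks toward the minimum at 0
print(round(theta, 4))  # 0.0576
```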
3) Traditional (full-batch) methods are more time-consuming because they process the entire dataset on every step; this requires more memory and leads to redundant computation.
4) Why is stochastic gradient descent better than traditional methods? SGD does not use the entire dataset for each update: it randomly selects a single data point, computes the gradient of the cost function on that point alone, and is therefore much faster per step.
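A hedged pure-Python sketch of this contrast, fitting y = w·x by squared error (all names and constants are illustrative): the stochastic step touches one random sample, while the batch step averages gradients over every sample.

```python
import random

X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 4.0, 6.0, 8.0]           # true relation y = 2x

def sgd_step(w, lr=0.05):
    """O(1) work per step: gradient from one randomly chosen sample."""
    i = random.randrange(len(X))
    grad = 2.0 * (w * X[i] - Y[i]) * X[i]
    return w - lr * grad

def batch_step(w, lr=0.05):
    """O(n) work per step: gradient averaged over the whole dataset."""
    grad = sum(2.0 * (w * x - y) * x for x, y in zip(X, Y)) / len(X)
    return w - lr * grad

random.seed(0)
w = 0.0
for _ in range(200):
    w = sgd_step(w)
print(round(w, 2))   # converges to the true slope 2.0
```

Each stochastic step here is a contraction toward the true slope, so despite the sampling noise the iterate settles at w ≈ 2.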
5) There are some potential challenges:
The randomness introduced when selecting points adds noise to the gradient estimates.
The learning rate is critical: too high a rate leads to divergence, too low a rate to slow convergence.
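The learning-rate trade-off can be demonstrated on the same toy cost J(θ) = θ² (gradient 2θ); the helper below is a sketch, not library code. Each step multiplies θ by (1 − 2·lr), so a rate above 1.0 diverges while a tiny rate barely moves:

```python
# Illustrative sketch of the learning-rate trade-off on J(theta) = theta**2.

def run(lr, steps=10, theta=1.0):
    for _ in range(steps):
        theta = theta - lr * (2.0 * theta)   # theta *= (1 - 2 * lr)
    return theta

print(abs(run(lr=1.1)))    # |1 - 2.2|**10 = 1.2**10, about 6.19: diverging
print(abs(run(lr=0.001)))  # about 0.98: barely moved after 10 steps
print(abs(run(lr=0.4)))    # 0.2**10, about 1e-07: essentially at the minimum
```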
6) Potential contributions:
Develop learning-rate schedules.
Train on mini-batches instead of single points.
Implement techniques for handling large datasets.
Add regularization such as L1 and L2 penalties.
Avoid overfitting and underfitting by managing the bias-variance trade-off.