Best regressions? #10
Cool! Is there a way to evaluate the difference between using the different years for change_since, and how much choosing different years affects the results?
If you run I found that just the years 2004-2006 give an adjusted R-squared of 0.34 on their own. Other three-year intervals are quite a lot worse, so there's probably something in it. Kudos to @malteserteresa for suggesting that interval. Going backwards is a lot better for most pairs, I suspect because I normalise by dividing the difference between population fractions by the population fraction of the first year given. Going backwards, the first year given is the later one, so by dividing by the later year I think I'm encoding some more of the information about the relative populations of that later year. Which means I'm getting better information on the population makeup during the year that's closest to the vote.
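To make the normalisation above concrete, here is a minimal sketch. It is not the project's actual change_since (which, going by this thread, takes only since and till); the extra `fractions` argument and the numbers are made up for illustration.

```python
def change_since(fractions, since, till):
    """Relative change in a population fraction between two years.

    `fractions` is a hypothetical dict mapping year -> population fraction
    for a single area. The difference is divided by the fraction in `since`,
    i.e. the first year given.
    """
    base = fractions[since]
    if base == 0:
        raise ValueError("cannot normalise by a zero population fraction")
    return (fractions[till] - base) / base


# Made-up fractions for one area, purely for illustration.
fractions = {2004: 0.10, 2006: 0.12}

forwards = change_since(fractions, 2004, 2006)   # normalised by the 2004 fraction
backwards = change_since(fractions, 2006, 2004)  # normalised by the 2006 fraction
```

The two directions are not just sign flips of each other: the forwards value is scaled by the earlier year's fraction and the backwards value by the later year's, which is the asymmetry described above.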
Heatmap of the output of exploreUK(): white means the model didn't converge, darker colours are better. You can clearly see that backwards is better and that change in immigration before 2003 is pretty much useless. The regression model's only features are change_since(since, till) and the 2015 White British population. Code to reproduce:
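As a rough illustration of the kind of scan that can produce a heatmap like this (this is not the original exploreUK() code), the sketch below fits an ordinary least squares model for every ordered (since, till) pair and stores the adjusted R-squared in a grid. The function names, the plain NumPy least-squares fit, and the random data are all assumptions; here NaN marks the since == till diagonal, where the change is undefined, rather than a model that failed to converge.

```python
import itertools

import numpy as np


def adjusted_r_squared(X, y):
    """Fit OLS with an intercept and return the adjusted R-squared."""
    X = np.column_stack([np.ones(len(y)), X])
    n, k = X.shape  # k includes the intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


def scan_year_pairs(fractions, other_feature, y, years):
    """Adjusted R-squared for every ordered (since, till) pair.

    `fractions` maps year -> array of per-area population fractions.
    Rows of the returned grid index `since`, columns index `till`.
    """
    grid = np.full((len(years), len(years)), np.nan)
    for (i, since), (j, till) in itertools.permutations(enumerate(years), 2):
        change = (fractions[till] - fractions[since]) / fractions[since]
        X = np.column_stack([change, other_feature])
        grid[i, j] = adjusted_r_squared(X, y)
    return grid


# Random stand-in data: 50 areas, three years, one extra feature.
rng = np.random.default_rng(0)
years = [2003, 2004, 2005]
fractions = {year: rng.uniform(0.05, 0.3, size=50) for year in years}
other_feature = rng.uniform(0.5, 1.0, size=50)
y = rng.normal(size=50)

grid = scan_year_pairs(fractions, other_feature, y, years)
```

The resulting grid can be passed to any heatmap plotting routine; cells above the diagonal are "forwards" pairs and cells below it are "backwards" pairs.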
Would your conclusions be that only post 2003 change is influential? And that change in Asian and Black population representation was influential where this spiked within short time periods (if the significance appears and reappears as we adjust the time window)? Just trying to get my head around the outputs (I haven't come across this type of analysis before)
I'd say the interesting things about these heatmaps are:
I can't say anything about Asian or Black population change from those heatmaps, but log(1 / Asian p-value) is correlated with rsq as shown below: For Black population: And here are some heatmaps for the Asian data. The first one is coloured by log(1 / p-value), where log(1 / 0.05) ≈ 3.00. The next is binary on pvalue['Asian'] < 0.05:
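For anyone unfamiliar with the transform, here is a small sketch of the two colourings described above. It assumes the natural logarithm (which is what gives roughly 3 at p = 0.05); the function name and the p-values are made up for illustration.

```python
import math


def significance_score(p_value):
    """log(1 / p): larger means a smaller, more significant p-value."""
    return math.log(1.0 / p_value)


# Score that corresponds to the conventional p = 0.05 cut-off.
threshold = significance_score(0.05)

# Made-up p-values keyed by (since, till), purely for illustration.
p_values = {(2006, 2004): 0.001, (2004, 2006): 0.04, (2003, 2001): 0.6}

# Continuous colouring: the log-transformed p-value.
scores = {pair: significance_score(p) for pair, p in p_values.items()}

# Binary colouring: is the coefficient significant at the 5% level?
significant = {pair: p < 0.05 for pair, p in p_values.items()}
```

The binary map is the continuous one cut at `threshold`: a cell is significant exactly when its score exceeds that value.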
Best regressions I've found so far for England and for the whole of the UK. Years picked for change_since are chosen by a for loop. Details in source, but it's pretty simple.
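The year selection can be sketched roughly as below, assuming a score such as adjusted R-squared has already been computed for each (since, till) pair. This is a hypothetical helper rather than the code in the source, and the scores are made up.

```python
import math


def best_year_pair(scores):
    """Return the (since, till) pair with the highest score, ignoring NaNs."""
    valid = {pair: s for pair, s in scores.items() if not math.isnan(s)}
    if not valid:
        raise ValueError("no model produced a usable score")
    return max(valid, key=valid.get)


# Made-up scores; NaN stands in for a model that did not converge.
scores = {
    (2001, 2003): float("nan"),
    (2004, 2006): 0.10,
    (2006, 2004): 0.20,
}

best = best_year_pair(scores)
```

Picking the maximum over many pairs tends to flatter the winning pair, so the chosen score is best treated as an optimistic estimate.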
England benefits from IMD data slightly, but also just from excluding the other areas: