2 new blog posts

grahamstark · Dec 10, 2024 · e3bf5a7
1 parent 7ec23cb

Showing 2 changed files with 128 additions and 28 deletions.
diff --git a/docs/blog/_posts/2024-05-08-Capital.md b/docs/blog/_posts/2024-05-08-Capital.md
@@ -4,7 +4,7 @@ date:   2024-22-10
 category: Blog
 tag: Microsimulation
 tag: Data
-title: Capital
+title: Capital in the FRS and WAS 
 author: graham_s
 nav_exclude: true
 ---
@@ -13,9 +13,95 @@ Over the summer I built a microsimulation of Legal Aid for the [Scottish Legal A
 
 My SLAB contact asked a lot of good questions, especially about data. 
 
+One thing that came up is how different capital data is in the [FRS](http://research.dwp.gov.uk/asd/frs/) and [Wealth and Assets Survey](https://beta.ukdataservice.ac.uk/datacatalogue/series/doi/?id=2000056).
+
+I don't really know why this is.
+
+<!--more-->
+
+The FRS docs state:
+
+> Savings and investments: The data relating to savings and investments should be treated with caution. A high proportion of respondents do not know the interest received on their assets and therefore around one in ten cases are imputed. It is thought that there is some under- reporting of capital by respondents, in terms of both the actual values of the assets and the investment income. The FRS does not capture information on non-liquid assets. Therefore property, physical wealth and pensions accruing are not included in estimates of savings and investments. 
+>[^FRS-1]
+
+So the FRS is unlikely to be as accurate as WAS, and it only covers financial assets, but in practice it's *way* off.
+
+## How Far Off?
+
+This is a comparison of WAS Round 7 net financial wealth with the recorded wealth from the FRS.
+
+FRS financial wealth is `totcapb3` from the `benunit` record, summed to household level. The script for this is [wealth.jl in the STBScratch repository](https://github.com/grahamstark/StbScratch/).
+
+WAS is `hfinwntr7_sum`. 
+
+Neither is uprated.
+
+## FRS Financial Wealth 
+
+```
+
+Summary Stats:
+Length:         16108
+Missing Count:  0
+Mean:           35_423.728518
+Std. Deviation: 103429.012582
+Minimum:        0.000000
+1st Quartile:   0.000000
+Median:         3_481.669613
+3rd Quartile:   20000.000000
+Maximum:        1109802.991556
+
+```
+
+## WAS Financial Wealth (positives only)
+
+```
+
+julia> summarystats( washf.financial_wealth )
+
+Summary Stats:
+Length:         14306
+Missing Count:  0
+Mean:           105_456.663263
+Std. Deviation: 164508.676369
+Minimum:        1.000000
+1st Quartile:   7000.000000
+Median:         36_130.500000
+3rd Quartile:   128000.000000
+Maximum:        996000.000000
+
+```
+
+So the FRS median financial wealth is roughly a tenth of the WAS median.
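+
+As a quick check on that ratio, here is a minimal sketch that simply recomputes it from the figures printed above. The names `frs`, `was`, `median_ratio` and `mean_ratio` are made up for illustration and are not taken from `wealth.jl`.
+
+```julia
+# Summary figures copied from the two printouts above (neither is uprated).
+frs = (mean = 35_423.728518, median = 3_481.669613)
+was = (mean = 105_456.663263, median = 36_130.5)
+
+# Ratio of FRS to WAS for each statistic.
+median_ratio = frs.median / was.median
+mean_ratio = frs.mean / was.mean
+
+println("FRS/WAS median: ", round(median_ratio, digits = 3))
+println("FRS/WAS mean:   ", round(mean_ratio, digits = 3))
+```
+
+On these numbers the FRS comes out at a little under a tenth of WAS at the median and about a third at the mean. It isn't quite like for like, since the WAS sample here is positives only while the FRS one includes zeros.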
+
+### Techie Note
+
+This script was my first experiment with [Tidier](), Julia's [R]() [Tidyverse]() clone. I'm impressed, though there's less call for this in Julia than in R: Julia's loops are so efficient that all the piping isn't needed in quite the same way.
+
+Tidier lets you do things like: 
+
+```julia
+
+fhw = @chain buf begin
+    @group_by sernum
+    @filter totcapb3 < 1_000_000
+    @summarise hhwealth = sum(totcapb3)
+end
+
+washf = @chain wash begin
+    @rename financial_wealth=hfinwntr7_sum
+    @filter  financial_wealth > 0 && financial_wealth < 1_000_000 
+end
+
+```
+
+[^FRS-1]: DWP (2019) ‘Family Resources Survey 2019: Background Note and Methodology’. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/789455/family-resources-survey-2017-18-background-note-methodology.pdf.
+
+
+
+
 
 
-<!--more--
 
 
 

diff --git a/docs/blog/_posts/2024-05-09-Legal-Aid.md b/docs/blog/_posts/2024-05-09-Legal-Aid.md
@@ -4,53 +4,67 @@ date:   2024-01-12
 category: Blog
 tag: Microsimulation
 tag: Conferences
-title: IMA Conference 2024
+title: Modelling Legal Aid
 author: graham_s
 nav_exclude: true
 ---
 
-Just on the way back from the [9th World Congress of the International Microsimulation Association](https://ima-2024.wifo.ac.at/) in Vienna. 
-Northumbria allowed [Howard Reed](http://www.landman-economics.co.uk/about/) and me to go to present our NIHR work.
-
-It was fun. Met some good people, heard some interesting papers, and our presentations went fine, I think.
+A quick write-up of a little microsimulation of Legal Aid that I've written for the Scottish Legal Aid Board (SLAB). It was fun, though I crashed a few deadlines and took some unexpected turns.
 
 <!--more-->
 
-I did two. First was a 1 1/2 hour pre-conference hands on thing introducing microsimulation in Julia, then Howard and I presented the NIHR work.
+## Previous Attempts
+
+I've done a few models of Legal Aid in the past.
+
+In 2001/2, back when I was still at the Institute for Fiscal Studies (IFS), I produced what I'm pretty certain was the first ever microsimulation modelling of Legal Aid in England and Wales, along with the brilliant Alexy Buck of the Legal Services Research Centre[^LA1].
+
+Subsequently, I produced similar models for SLAB and the Northern Ireland Legal Aid Board, including the June 2007 report “Modelling Financial Eligibility For Legal Aid”, jointly with the rather less brilliant Tony Dignan.
+
+For the England and Wales studies, the modelling used a variant of the IFS Tax and Benefit model[^TAXBEN]. The Scottish and Northern Irish work used a custom-built model, written in [Ada](https://learn.adacore.com/) and released under an Open Source Licence on the GitHub code sharing site, where it [remains today](https://github.com/grahamstark/scottish_legal_aid).
+
+The England/Wales work was especially rewarding because we used the model to develop a systematic method for simplifying the system whilst making as few changes to the entitled population as possible. That's important because most tax-benefit simplification proposals - flat taxes, basic incomes, negative income taxes, or whatever - involve huge distributional changes, usually in ways that aren't intended or wanted.
+
+## The Model
+
+Like the English work, but unlike the previous SLAB model, this one was integrated into ScotBen rather than standing alone. Building on an existing model had several advantages:
+
+* much (but not all – see below) of the hard work of data creation, weighting and uprating is already done;
+* we can also re-use output routines, for example tabulators;
+* also, SLAB gets a certain amount of policy relevant modelling as a bonus, such as the effects on Legal Aid eligibility of changes to Universal Credit, or of the gradual phasing out of legacy passport benefits such as Income Support.
+
+One issue was that ScotBen currently has no correction for non-takeup of means-tested benefits, so it may overstate the effects of passporting these benefits on Legal Aid eligibility.
+
+
+### The Scottish Crime and Justice Survey
+
+The previous models were mainly concerned with financial eligibility for legal aid. But going from eligibility to actual expenditure is much trickier than, say, going from eligibility for Universal Credit to actually receiving it. The initial proposal involved using the [Civil legal module]() of the [Scottish Crime And Justice Survey](https://www.gov.scot/news/scottish-crime-justice-survey/) (SCJS) to impute the likelihood of each FRS person experiencing a legal problem that might require going to a solicitor. So I spent the first few weeks of the project doing that. That went well, and produced some interesting results - [probits modelling reporting a civil problem against family type, income, housing and so on](https://github.com/grahamstark/ScottishTaxBenefitModel.jl/blob/master/regressions/civil-problems-scjs.jl). I might try publishing something using those regressions. But this approach got vetoed by SLAB, on the grounds that the problem categories used in SCJS don't match the reporting categories SLAB use internally. The client is not always right, but they are always the client. That threw me quite a bit as I'm old and like sticking to the plan.
 
-## Live Coding A Tax Benefit Model.
+### Problems
 
-I volunteered for this! I'd imagined 4 or 5 people but the room was packed out. I'm just really lucky to be surrounded by the best people. Judith ran through the whole thing the day before, made some really useful notes - the upshot was I was planning on taking them through far too much stuff and should cut down drastically. On the day J and H stopped me panicking on the day about screens, networks and all the things I always panic over, and H kept an eye on things during the presentation and stopped me making some silly mistakes like not saving files. And my son sneaked it - he's doing is International Law masters at the University.
+Apart from the SCJS thing, I thought the project went pretty well, though a few deadlines were crashed. My collaborator at SLAB, Kieran Forbes, was really on the ball but supportive. He asked a lot of good questions and really helped by pushing for a better understanding of the data - especially capital and expenses, which are pretty central to the Legal Aid calculations. I spent a *lot* of time sorting capital and expenses out - see [this post]() on capital; I'll write up expenses presently.
 
-The material is [on Github](https://github.com/grahamstark/IMAWorkshop/). Some thoughts:
+### Admin Data
 
-* [Pluto](https://plutojl.org/) works really well in this context;
-* A rehearsal really helps, even it you're not planning on sticking to a script;
-* I used that repository to encourage people to pre-load the right Julia version and some packages before the event, but not everyone saw it;
-* you can hide if not actually prevent the [time to first plot (TTFP)](https://blog.glcs.io/julia-1-10#heading-improved-latency-or-getting-started-faster) thing by loading packages incrementally: so just Pluto in the REPL, start that, then just enough to get some data, talk a bit, then plots and whaterver else. TTFP means essentially that Julia buys its very fast runtimes at the expense of slow startup times, and you don't want to have to wait 10 minutes in a live class while packages compile if you can avoid it;
-* I had a real worry with the data because the conditions the Archive put even on publicly available teaching datasets are pretty hard to comply with in a setting where I couldn't know who was turning up. Fortunately I'd been experimenting with [SynthPop](https://synthpop.org.uk/get-started.html) and [SimPop](https://cran.r-project.org/web/packages/simPop/index.html), two R packages for generating synthetic datasets that resemble some true target dataset. Of the two, SynthPop is much the easiest to use though SimPop has useful features like being able to mimic household structures over multiple records. So I took an old UKDS teaching [LCF](https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=6117) subset, aggregated some of the records, ran it through SynthPop and we were good to go. I'm not 100% sure what SynthPop is doing, but the marginal distrubutions are remarkably close to the base data. The [R code is in the repo](https://github.com/grahamstark/IMAWorkshop/blob/main/src/syndata.R) and the synthetic dataset is [here](https://virtual-worlds.scot/ou/uk-lcf-subset-2005-6.csv);
-* the Blue Peter approach of building something and getting everyone to play along works well: better than explaining some long list of language features I think;
-* people had problems with getting Wifi, with having old Julia versions installed and other stuff but it was lovely to see people helping each other out. It was J who reminded me to build in slack for all this.
+SLAB made up for vetoing my big SCJS idea by giving me some really nice admin data. It's fantastic data - case costs, contributions, case types. Unfortunately there isn't much demographic information, so the usual data matching tricks don't work. I settled for matching rough crosstabs of cases and genders against aggregate entitlements, to give rough propensities to claim by case type, entitlement level and gender. Unfortunately I can't distribute this because of confidentiality, but I'm considering building a [synthetic dataset](https://docs.sdv.dev/sdv) version.
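+
+To make the matching idea concrete, here is a rough sketch of one way to turn crosstabs into propensities: divide the admin case counts in each cell by the modelled number of entitled people in the same cell. All the numbers below are invented for illustration and are not SLAB data. The cell keys, the `propensities` function and the omission of entitlement levels are simplifications for this sketch, not the model's actual code.
+
+```julia
+# Hypothetical admin case counts by (case type, gender); invented, not SLAB data.
+cases = Dict(
+    ("family", "female") => 120,
+    ("family", "male") => 80,
+    ("housing", "female") => 30,
+)
+
+# Hypothetical modelled counts of people entitled in the same cells (also invented).
+entitled = Dict(
+    ("family", "female") => 1200.0,
+    ("family", "male") => 1000.0,
+    ("housing", "female") => 600.0,
+)
+
+# Propensity to claim in a cell = cases / entitled; cells with nobody entitled are skipped.
+function propensities(cases, entitled)
+    return Dict(k => n / entitled[k] for (k, n) in cases if get(entitled, k, 0.0) > 0)
+end
+
+p = propensities(cases, entitled)
+for (cell, rate) in sort(collect(p))
+    println(cell, " => ", round(rate, digits = 3))
+end
+```
+
+Each propensity can then be read as a rough probability that an entitled person in that cell actually makes a claim.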
 
-In `src` there's the [more complicated model I'd planned to use](https://github.com/grahamstark/IMAWorkshop/blob/main/src/pluto-tb-model.jl) and also the [drastically stripped down one I ended up live-coding and walking people through](https://github.com/grahamstark/IMAWorkshop/blob/main/src/pluto-tb-basic.jl). There's a lesson there.
+### Validation
 
-I think people enjoyed it - going slow through a very simple example meant that most people got the model we were building to work, and that's a pretty satisfying feeling for everyone, I hope. 
+The model was built [test first](https://www.agilealliance.org/glossary/tdd/) as usual. Mostly the tests replicated the results from the [two]() [online]() Legal Aid Calculators, but later on I added a bunch of tests of the code that imputed claims onto entitlements, since that ended up being pretty brittle.
 
-## TriplePC/NIHR
+### Front Ends
 
-On the Tuesday we had our TriplePC presentation. Howard did most of it and I ren though a live model demo. Howard is a very good presenter, the model didn't fall over, and I can wave my arms around, so all in all it went OK. [Here's the presentation](https://virtual-worlds.scot/ou/ima-presentation.pdf). There were 4 presentations in 1 1/2 hours including questions, so it's all pretty compressed. The more presentations I do, the more I want to cut things out and simplify; just get a few messages across.
+The model is not much use without an interface. It was a struggle though. It's a convoluted story but we ended up having to install the model on an i3 laptop with 8GB RAM. Which was a lot when we wrote [Virtual Economy]() but not a lot to run a [Julia Instance]() with a [Web Stack]() in. It turned out pretty well in the end but getting the whole thing to run efficiently was a *lot* of work. I'm proud of the crosstab. And I learned a decent amount about how to write an efficient job queue and session manager.
 
-## The rest of the conference
+<hr/>
 
-The conference was really well organised: very Austrian. [Martin Spielauer](https://www.wifo.ac.at/en/martin_spielauer), the main organiser, was very accomodating & friendly and the University tech staff were great for us - not always a given.
+[^LA1]: Buck, Alexy, and Graham Stark. ‘Means Assessment: Options for Change’. Legal Services Commission, 2001. http://webarchive.nationalarchives.gov.uk/20100210214359/http://lsrc.org.uk/publications/meansassessmentoptionsforchange.pdf.
 
-I liked a lot of the papers I went to. I've always wanted to get into Agent Based Modelling so I went to a session on that. Interesting, but pretty uncompromising presentations with pages of small-multiple graphs, lots of maths, hard to read text. Slow down! The papers in our session were well presented but sometimes it was hard to see what they were trying to achieve. I went to some straight tax-benefit papers and some labour supply/dynamics things. A recurring theme is that things get a bit off whenever Euromod is involved - remarkable organisation/Grant extraction system, but well dodgy software, super-overconfident researchers. The paper that impressed me most was a relatively straighforward one on [projecting family care in Germany](https://ima-2024.wifo.ac.at/content/abstracts/rebaudo.html). I think my students would like that one.
+———. ‘Simplicity versus Fairness in Means Testing: The Case of Civil Legal Aid’. Fiscal Studies 24, no. 4 (2003): 427–49. https://doi.org/10.1111/j.1475-5890.2003.tb00090.x.
 
-The [Policy Engine](https://policyengine.org) people went all in and staged their own fringe event, with pizza and wine, in a rental office down by Prater. *Lots* to think about there and worth a post of its own.
+[^TAXBEN]: Johnson, P. G., G. K. Stark, and S. J. Webb. ‘TAXBEN 2: The New IFS Tax and Benefit Model’, 1990. https://virtual-worlds.scot/publications/docs/stark-webb-taxben.pdf.
 
-Downside was I came down with some horrible non-covid thing which had everything: nosebleeds, hacking cough, diarrhea.. Adrenaline gets you through doing the presentations but I'm pretty exhausted now & I bunked off straight after our 2nd presentation so missed a lot of socialising and the whole of the final day.
 
-So that's Vienna for a while. I love that place.