Ib naming model #50

nathimel · 2024-11-23T04:03:56Z

🥳 We now have an IB-BA implementation native to ULTK and a minimal example comparing efficiency analyses using grammatical/LoT complexity to IB.

Here is a summary of the changes I made:

- Added an information_bottleneck submodule within ultk.effcomm, which includes a standalone implementation of the IB-BA algorithm and an IBNamingModel.
- RDOT is no longer a dependency.
- I have not added any tests. :/ I have a couple of very basic ones from RDOT, but I'm not sure how analytically 'true' they are, so I left them out.
- I removed the signaling game example. It was a minimal example of how to use the agent.py module with adaptive dynamics together with the effcomm module, but ultimately I think it's a little outside the scope of ULTK, at least for now. It also required RDOT, because it involved computing rate-distortion (not IB) bounds. That style of analysis is controversial, and I'd rather not have to defend it later on.
- The color domain is infeasibly slow to compute our own bounds for. I have tried several code optimizations, including jax and torch to take advantage of the GPU, but BA simply takes too long for a simple working example. Therefore, I think a better example showcasing the utility of ULTK is to replicate the results for a different domain that's easier to compute.
- In light of this, I added a modals example. It reads in natural languages (I'm open to moving those out of this repo and downloading them by URL from the modals-effcomm repo instead), performs the same analysis pipeline as the indefinites example, and then computes IB bounds for the domain based on the 'half-credit' communicative utility metric. We then get a nice comparison of the two approaches to efficient communication analysis of modals. I think this notebook might take about 1 minute to run from start to finish.
I've been working quickly, so feel free to ask me to go back and make changes etc. Cheers!
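
For reviewers less familiar with IB-BA, here is a minimal pure-Python sketch of the standard IB Blahut-Arimoto iteration on a toy domain. It is not the code in the new information_bottleneck submodule: the function names, the toy distributions, and the parameter values are all made up for illustration, and it only assumes the usual IB self-consistent updates for q(w|m), q(w), and q(u|w).

```python
import math
import random


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats, clipping q away from zero for numerical safety."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def mutual_information(joint):
    """Mutual information (in bits) of a joint distribution given as nested lists."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )


def ib_blahut_arimoto(p_m, p_u_given_m, n_words, beta, n_iter=200, seed=0):
    """Iterate the IB self-consistent equations; returns the encoder q(w|m)."""
    rng = random.Random(seed)
    n_m, n_u = len(p_m), len(p_u_given_m[0])
    # Random initial encoder, one normalized row per meaning.
    encoder = []
    for _ in range(n_m):
        row = [rng.random() + 1e-3 for _ in range(n_words)]
        encoder.append([x / sum(row) for x in row])
    for _ in range(n_iter):
        # Word marginal: q(w) = sum_m p(m) q(w|m)
        q_w = [sum(p_m[m] * encoder[m][w] for m in range(n_m)) for w in range(n_words)]
        # Decoder: q(u|w) = sum_m p(m) q(w|m) p(u|m) / q(w); uniform if q(w) = 0
        q_u_given_w = [
            [
                sum(p_m[m] * encoder[m][w] * p_u_given_m[m][u] for m in range(n_m)) / q_w[w]
                if q_w[w] > 0 else 1.0 / n_u
                for u in range(n_u)
            ]
            for w in range(n_words)
        ]
        # Encoder update: q(w|m) proportional to q(w) * exp(-beta * KL(p(u|m) || q(u|w)))
        new_encoder = []
        for m in range(n_m):
            logits = [
                (math.log(q_w[w]) if q_w[w] > 0 else -math.inf)
                - beta * kl_divergence(p_u_given_m[m], q_u_given_w[w])
                for w in range(n_words)
            ]
            top = max(logits)  # subtract the max before exponentiating for stability
            weights = [math.exp(l - top) for l in logits]
            total = sum(weights)
            new_encoder.append([x / total for x in weights])
        encoder = new_encoder
    return encoder


# Toy domain (made-up numbers): 4 meanings, each a noisy distribution over 4 referents.
p_m = [0.25] * 4
p_u_given_m = [[0.7 if u == m else 0.1 for u in range(4)] for m in range(4)]

encoder = ib_blahut_arimoto(p_m, p_u_given_m, n_words=3, beta=5.0)

# Complexity I(M;W) from the joint p(m) q(w|m).
complexity = mutual_information([[p_m[m] * q for q in encoder[m]] for m in range(4)])
# Accuracy I(W;U) from the joint q(w,u) = sum_m p(m) q(w|m) p(u|m).
accuracy = mutual_information(
    [
        [sum(p_m[m] * encoder[m][w] * p_u_given_m[m][u] for m in range(4)) for u in range(4)]
        for w in range(3)
    ]
)
print(f"complexity={complexity:.3f} bits, accuracy={accuracy:.3f} bits")
```

Sweeping beta and recording each (complexity, accuracy) pair traces out an approximate bound. Every inner update here is a nested loop over meanings, words, and referents, which gives a rough sense of why a 330-chip color domain is so much more expensive than modals.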

nathimel · 2024-11-23T04:13:09Z

A follow-up: I'm on the fence about whether to include the colors/ example at all, tbh. There is nothing in that notebook that is not already covered by https://github.com/nogazs/ib-color-naming/tree/master. This doesn't mean there can't be: we could look at how to convert encoders into ULTK languages, and then measure informativity differently. However, (1) right now effcomm.informativity.informativity infers a binary_matrix by default for each language for its Speaker and Listener, which loses an enormous amount of information from the stochastic encoders $q(W|M)$, so adding a few extra cells to the notebook to illustrate this will require changes to the informativity submodule and to agent.py. That is fine, but probably best left to another PR. (2) Empirically, I found measuring informativity to be quite slow. I think this is because we currently need to build a utility matrix, which is a (330, 330) matrix, for each call to effcomm.informativity.informativity, and we also need to evaluate the Meaning mapping of each expression in a Language, which itself requires a large FrozenDict. On the bright side, the comparisons of different language measures are present in the modals example.
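
To make point (1) concrete, here is a small standalone sketch of how binarizing a stochastic encoder can erase the differences between languages. It does not call ULTK: the `binarize` helper is a hypothetical stand-in that marks an entry 1 wherever the encoder assigns nonzero probability, which is an assumption about the kind of collapse involved, not a description of how binary_matrix is actually computed. The two encoders and the uniform prior are made-up toy values.

```python
import math


def binarize(encoder):
    """Hypothetical stand-in: 1 wherever q(w|m) > 0, else 0."""
    return [[1 if p > 0 else 0 for p in row] for row in encoder]


def encoder_information(p_m, encoder):
    """I(M;W) in bits for a prior p(m) and a stochastic encoder q(w|m)."""
    joint = [[p_m[m] * q for q in row] for m, row in enumerate(encoder)]
    p_w = [sum(col) for col in zip(*joint)]
    return sum(
        pmw * math.log2(pmw / (p_m[m] * p_w[w]))
        for m, row in enumerate(joint)
        for w, pmw in enumerate(row)
        if pmw > 0
    )


p_m = [0.5, 0.5]
sharp = [[0.9, 0.1], [0.1, 0.9]]  # words are strongly tied to meanings
flat = [[0.5, 0.5], [0.5, 0.5]]   # words carry no information about meanings

# Both encoders collapse to the same all-ones binary matrix ...
print(binarize(sharp) == binarize(flat))
# ... even though their information content is very different.
print(encoder_information(p_m, sharp), encoder_information(p_m, flat))
```

Any measure computed from the binary matrix alone has to assign these two encoders the same score, which is consistent with all color languages ending up with identical informativity.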

nathimel · 2024-11-23T04:18:01Z

What if we replaced the images in the README with the two trade-off plots in the modals example? 🤔 The latter are both original to ULTK, instead of pulled from other papers which we have not exactly replicated in ULTK.

Nathaniel Imel and others added 30 commits February 5, 2024 16:52

change dist to tuple

4504143

mypy-induced cleaning

bd5b281

rename language.to_dict(), remove other to_dict() methods

d5baad5

roll back some to_dict to avoid yaml reading errors

4b10e2c

Automated black formatting

db19983

Merge pull request #36 from CLMBRs/to_dict-pretty_print

5d2a5c3

To dict pretty print

Trying the sampling to generate hypothetical languages, removed 15 languages

d07b166

generated per-language color maps for debug

b43021f

Added IB curve back + more customization

a9a0a58

Added color term expansion to artificial languages, some cleanup

0afabe4

Added centroid color calculation, caching IB bound, settings for artificial languages

5264598

Filter out color chips that don't meet the threshold

212582c

Split main script up by functionality

60f86d5

Merge branch 'main' into color-categories

125633c

Mostly code cleanup for merge

bec0f54

Automated black formatting

1399d7e

Merge branch 'main' into color-categories

3830973

beginning color refactor

8e33f88

reorder universe by chip number

ac73f71

scripts for color_universe, begin reading natural languages

b68da30

natural languages generated

7dfacf5

nat langs -> yaml, begin encoder

a4b4393

nat langs -> yaml, begin encoder

6b971eb

single language for testing

d1bb4a1

rm color universe pkl

2c3f568

color universe pkl -> yaml

9ae7307

rename cols in universe, compute meaning dists

608539d

remove unneeded variable

b0395b8

starting probability in ULTK

640c475

compute complexity and accuracy of nat langs

3ad8c8c

shanest and others added 10 commits September 20, 2024 12:33

some shape fixing stuff for lang -> info plane measurement

d3ac8ed

evaluation of bound takes 2 hours with just 30 points

37e3f15

set semantics _dist to False and add zkrt prior

8117dba

fix shape error

85ef05c

ultk informativity metrics seem to lose important info in Speaker Listener initialization of weights st all color languages have identical informativity

38289e4

three days of BA is too much for color example

3d98afe

interesting

a1145f9

remove signaling

79036bc

add some notes

9bef938

update colors model binary

9e00f3f

nathimel requested a review from shanest November 23, 2024 04:04

Automated black formatting

eef58f3

Nathaniel Imel and others added 3 commits November 23, 2024 20:06

delete rate_distortion and clean up docs

194dbba

Merge branch 'ib-naming-model' of github.com:CLMBRS/altk into ib-naming-model

934bbf1

Automated black formatting

e158cdd
