how to express 3 discrete latent codes (each with dimension 20) and visual work ok? #10

zdx3578 · 2016-09-24T02:55:43Z

how to express 10 dimensional categorical variables

code:
latent_spec = [
(Uniform(62), False),
(Categorical(10), True),
(Uniform(1, fix_std=True), True),
(Uniform(1, fix_std=True), True),
]
is for mnist ,
this is not enough,
in paper:
MNIST, we choose to model the latent codes with one categorical code, c1 ⇠ Cat(K = 10, p = 0.1), which can model discontinuous variation in data, and two continuous codes that can capture variations that are continuous in nature: c2 , c3 ⇠ Unif ( 1, 1).

but what to express: Street View House Number (SVHN
we make use of four 10 dimensional categorical variables and two uniform continuous variables as latent codes.

CelebA
In this dataset, we model the latent variation as 10 uniform categorical variables, each of dimension 10.

append c.3
generator G
Input 2 R228 228 how to get 228?

discriminator D / recognition network Q generator G
Input 32 ⇥ 32 Color image Input 2 R228
4 ⇥ 4 conv. 64 lRELU. stride 2 FC. 2 ⇥ 2 ⇥ 448 RELU. batchnorm
4 ⇥ 4 conv. 128 lRELU. stride 2. batchnorm 4 ⇥ 4 upconv. 256 RELU. stride 2. batchnorm 4 ⇥ 4 conv. 256 lRELU. stride 2. batchnorm 4 ⇥ 4 upconv. 128 RELU. stride 2.
FC. output layer for D,
FC.128-batchnorm-lRELU-FC.output for Q 4 ⇥ 4 upconv. 64 RELU. stride 2.
4 ⇥ 4 upconv. 3 Tanh. stride 2.

any one any help?
thanks very much !

zdx3578 · 2016-09-25T01:55:18Z

if isinstance(dist, Gaussian):
assert dist.dim == 1, "Only dim=1 is currently supported"
c_vals = []
for idx in xrange(10):
c_vals.extend([-1.0 + idx * 2.0 / 9] * 10)
c_vals.extend([0.] * (self.batch_size - 100))
vary_cat = np.asarray(c_vals, dtype=np.float32).reshape((-1, 1))
cur_cat = np.copy(fixed_cat)
cur_cat[:, offset:offset+1] = vary_cat
offset += 1
elif isinstance(dist, Categorical):
lookup = np.eye(dist.dim, dtype=np.float32)
cat_ids = []
for idx in xrange(10):
cat_ids.extend([idx] * 10)
cat_ids.extend([0] * (self.batch_size - 100))
cur_cat = np.copy(fixed_cat)
cur_cat[:, offset:offset+dist.dim] = lookup[cat_ids]
offset += dist.dim
elif isinstance(dist, Bernoulli):
assert dist.dim == 1, "Only dim=1 is currently supported"

zdx3578 · 2016-09-25T15:53:50Z

embedding_dim = 100

latent_spec = [
    (Uniform(64), False),
    (Categorical(32), True),
]
con_latent_spec = [
    (LatentGaussian(embedding_dim), True)
]

https://github.com/RutgersHan/InfoGAN/blob/dev_auto/launchers/generate_images.py

zdx3578 · 2016-10-01T09:16:50Z

C.3 CelebA
The network architectures are shown in Table 3. The discriminator D and the recognition network Q shares most of the network. For this task, we use 10 ten-dimensional categorical code and 128 noise variables, resulting in a concatenated dimension of 228.

is

latent_spec = [
    (Uniform(128), False),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
]

but how to config C.5 Chairs as below

The network architectures are shown in Table 6. The discriminator D and the recognition network Q shares the same network, and only have separate output units at the last layer. For this task, we use 1 continuous latent code, 3 discrete latent codes (each with dimension 20), and 128 noise variables, so the input to the generator has dimension 189.

elif isinstance(dist, Bernoulli):
assert dist.dim == 1, "Only dim=1 is currently supported"

NHDaly · 2016-10-01T16:33:51Z

The above latent_spec worked okay for me.

c3_celebA_latent_spec = [
    (Uniform(128), False),  # Noise
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
    (Categorical(10), True),
]
c3_celebA_image_size = 32

Can you elaborate a bit more in words what you're having problems with? I'm not sure I understand what's not working for you.

zdx3578 · 2016-10-03T08:29:38Z

thanks NHDaly ! can share your code?

i think celebA config is ok, now question is about how to config C.5 Chairs as below

The network architectures are shown in Table 6. The discriminator D and the recognition network Q shares the same network, and only have separate output units at the last layer. For this task, we use 1 continuous latent code, 3 discrete latent codes (each with dimension 20), and 128 noise variables, so the input to the generator has dimension 189.

3 discrete latent codes (each with dimension 20) but visual code is below:
elif isinstance(dist, Bernoulli):
assert dist.dim == 1, "Only dim=1 is currently supported"

We used separate configurations for each learned variation, shown in Table 7. For this task, we found it necessary to use different regularization coefficients for the continuous and discrete latent codes.

so how to config c.5 chairs,and how to change visual code ? or visual code wont change??

and 2 question is :
in C.4 Faces
The network architectures are shown in Table 4. The discriminator D and the recognition network Q shares the same network, and only have separate output units at the last layer. For this task, we use 5 continuous latent codes and 128 noise variables, so the input to the generator has dimension 133.
We used separate configurations for each learned variation, shown in Table 5.

how to config 'separate configurations for each learned variation' ??

NHDaly · 2016-10-04T05:02:21Z

how to config C.5 Chairs as below

I might be misunderstanding, but it seems like

For this task, we use 1 continuous latent code, 3 discrete latent codes (each with dimension 20), and 128 noise variables, so the input to the generator has dimension 189.

would translate to the following latent_spec. That is, the continuous code is represented by Uniform and the discrete code is represented by Categorical:

c5_chairs_latent_spec = [
    (Uniform(128), False),  # Noise
    (Uniform(1, fix_std=True), True),
    (Categorical(20), True),
    (Categorical(20), True),
    (Categorical(20), True),
]
c3_celebA_image_size = 32

I copied the (Uniform(1, fix_std=True), True) line from the two continuous variables defined in run_mnist_exp.py, which I believe represent the "2 continuous latent codes" referenced from the MNIST section of the paper.

I'm not sure where you got the LatentGaussian from... I don't know if it's necessary? I haven't tried running the Chairs model at all.

NHDaly · 2016-10-04T05:04:42Z

That said, I am also very curious about the answer to this question:

how to config 'separate configurations for each learned variation' ?

Does this mean that you ran the experiment multiple times with the same number of codes, but each of the codes tends to perform best for each of the provided settings?

neocxi · 2016-10-04T05:36:49Z

That is, the continuous code is represented by Uniform and the discrete code is represented by Categorical:

This is correct. Thanks @NHDaly !

Does this mean that you ran the experiment multiple times with the same number of codes, but each of the codes tends to perform best for each of the provided settings?

Yes, to better compare with previous supervised results, we select codes from multiple runs that are most similar to categories that previous method (DC-IGN) produces.

zdx3578 · 2016-10-04T10:31:49Z

for @NHDaly ref this https://github.com/RutgersHan/InfoGAN/blob/dev_auto/launchers/run_flower_exp.py#L49

is your celeba train result is ok?

for @neocxi 1
self.reg_cont_latent_dist = Product([x for x in self.reg_latent_dist.dists if isinstance(x, Gaussian)])
self.reg_disc_latent_dist = Product([x for x in self.reg_latent_dist.dists if isinstance(x, (Categorical, Bernoulli))])
Bernoulli is also discrete. where to use Bernoulli?

2
can @neocxi give a example show which parameter is according to ?
is info_reg_coeff=1.0, parameter??

3 what cause NAN error? D and G learning rate not equilibrium??

Epoch 14 | discriminator_loss: 0.128064; generator_loss: 2.78964; MI_disc: 20.3559; CrossEnt_disc: 2.66993; MI: 20.3559; CrossEnt: 2.66993; max_real_d: 0.999938; min_real_d: 0.560705; max_fake_d: 0.240968; min_fake_d: 0.0144349
STR: 'avg_log_vals' is [ 1.28064305e-01 2.78963685e+00 2.03559246e+01 2.66993141e+00
2.03559246e+01 2.66993141e+00 9.99938190e-01 5.60704947e-01
2.40968212e-01 1.44348787e-02]
STR: 'ganlp' is 2 |ETA: --:--:--
Epoch 15 | discriminator_loss: nan; generator_loss: nan; MI_disc: nan; CrossEnt_disc: nan; MI: nan; CrossEnt: nan; max_real_d: -inf; min_real_d: inf; max_fake_d: -inf; min_fake_d: inf
STR: 'avg_log_vals' is [ nan nan nan nan nan nan -inf inf -inf inf]
Traceback (most recent call last):
File "launchers/run_mnist_exp.py", line 97, in
algo.train()
File "/home/ubuntu/wordk/InfoGAN/infogan/algos/infogan_trainer.py", line 335, in train
raise ValueError("NaN detected!")
ValueError: NaN detected!

4 celeba train need how long ? epoch log can share ?

log d loss very small g loss bigger

zdx3578 · 2016-11-22T13:29:14Z

A.2 INFOGAN TRAINING
To train the InfoGAN network described in Tbl. 1 on the 2D shapes dataset (Fig. 6), we followed the training paradigm described in Chen et al. (2016) with the following modifications. For the mutual information regularised latent code, we used 5 continuous variables ci sampled uniformly from ( 1,1). We used 5 noise variables zi, as we found that using a reduced number of noise variables improved the quality of generated samples for this dataset. To help stabilise training, we used the instance noise trick described in Shi et al. (2016), adding Gaussian noise to the discriminator inputs (0.2 standard deviation on images scaled to [ 1, 1]). We followed Radford et al. (2015) for the architecture of the convolutional layers, and used batch normalisation in all layers except the last in the generator and the first in the discriminator.

from
beta-VAE: LEARNING BASIC VISUAL CONCEPTS WITH A CONSTRAINED VARIATIONAL FRAMEWORK

simonzhang0158 · 2017-05-23T18:17:22Z

I am wondering if there is any update about CelebA dataset. I have the same problem as @zdx3578 when I am trying to train CelebA. @zdx3578 Did you find a way to solve this? The paper you mentioned above I believe is the set up for 2D shapes dataset.

Any help will be appreciate.

zdx3578 changed the title ~~how to express 10 dimensional categorical variables~~ how to express 3 discrete latent codes (each with dimension 20) and visual work ok? Oct 3, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

how to express 3 discrete latent codes (each with dimension 20) and visual work ok? #10

how to express 3 discrete latent codes (each with dimension 20) and visual work ok? #10

zdx3578 commented Sep 24, 2016 •

edited

Loading

zdx3578 commented Sep 25, 2016

zdx3578 commented Sep 25, 2016 •

edited

Loading

zdx3578 commented Oct 1, 2016 •

edited

Loading

NHDaly commented Oct 1, 2016

zdx3578 commented Oct 3, 2016 •

edited

Loading

NHDaly commented Oct 4, 2016

NHDaly commented Oct 4, 2016

neocxi commented Oct 4, 2016

zdx3578 commented Oct 4, 2016 •

edited

Loading

zdx3578 commented Nov 22, 2016

simonzhang0158 commented May 23, 2017

how to express 3 discrete latent codes (each with dimension 20) and visual work ok? #10

how to express 3 discrete latent codes (each with dimension 20) and visual work ok? #10

Comments

zdx3578 commented Sep 24, 2016 • edited Loading

zdx3578 commented Sep 25, 2016

zdx3578 commented Sep 25, 2016 • edited Loading

zdx3578 commented Oct 1, 2016 • edited Loading

NHDaly commented Oct 1, 2016

zdx3578 commented Oct 3, 2016 • edited Loading

NHDaly commented Oct 4, 2016

NHDaly commented Oct 4, 2016

neocxi commented Oct 4, 2016

zdx3578 commented Oct 4, 2016 • edited Loading

zdx3578 commented Nov 22, 2016

simonzhang0158 commented May 23, 2017

zdx3578 commented Sep 24, 2016 •

edited

Loading

zdx3578 commented Sep 25, 2016 •

edited

Loading

zdx3578 commented Oct 1, 2016 •

edited

Loading

zdx3578 commented Oct 3, 2016 •

edited

Loading

zdx3578 commented Oct 4, 2016 •

edited

Loading