This repository has been archived by the owner on Jan 3, 2023. It is now read-only.

Issue loading dataset #7

Open
iandewancker opened this issue May 6, 2016 · 9 comments
Comments

@iandewancker

Hey there, I am playing around with the "cifar10_msra.py" example and ran into a snag when constructing the ImageLoader:

In [15]: train = ImageLoader(set_name='train', shuffle=True, do_transforms=True, **imgset_options)
libdc1394 error: Failed to initialize libdc1394
---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
<ipython-input-15-c033fd957d22> in <module>()
----> 1 train = ImageLoader(set_name='train', shuffle=True, do_transforms=True, **imgset_options)

/usr/local/lib/python2.7/dist-packages/neon/data/imageloader.pyc in __init__(self, repo_dir, inner_size, scale_range, do_transforms, rgb, shuffle, set_name, subset_pct, nlabels, macro, contrast_range, aspect_ratio)
    105                                           target_size=1, reshuffle=shuffle,
    106                                           nclasses=self.nclass,
--> 107                                           subset_percent=subset_pct)
    108
    109     def configure(self, repo_dir, set_name, subset_pct):

/usr/local/lib/python2.7/dist-packages/neon/data/dataloader.pyc in __init__(self, set_name, repo_dir, media_params, target_size, index_file, shuffle, reshuffle, datum_dtype, target_dtype, onehot, nclasses, subset_percent, ingest_params)
     85         self.ingest_params = ingest_params
     86         self.load_library()
---> 87         self.alloc()
     88         self.start()
     89         atexit.register(self.stop)

/usr/local/lib/python2.7/dist-packages/neon/data/dataloader.pyc in alloc(self)
    110             return BufferPair(ct_cast(buffers, 0), ct_cast(buffers, 1))
    111
--> 112         self.data = alloc_bufs(self.datum_size, self.datum_dtype)
    113         self.targets = alloc_bufs(self.target_size, self.target_dtype)
    114         self.device_params = DeviceParams(self.be.device_type,

/usr/local/lib/python2.7/dist-packages/neon/data/dataloader.pyc in alloc_bufs(dim0, dtype)
    102
    103         def alloc_bufs(dim0, dtype):
--> 104             return [self.be.iobuf(dim0=dim0, dtype=dtype) for _ in range(2)]
    105
    106         def ct_cast(buffers, idx):

/usr/local/lib/python2.7/dist-packages/neon/backends/backend.pyc in iobuf(self, dim0, x, dtype, name, persist_values, shared, parallelism)
    549
    550         if persist_values and shared is None:
--> 551             out_tsr[:] = 0
    552
    553         return out_tsr

/usr/local/lib/python2.7/dist-packages/neon/backends/nervanagpu.pyc in __setitem__(self, index, value)
    178     def __setitem__(self, index, value):
    179
--> 180         self.__getitem__(index)._assign(value)
    181
    182     def __getitem__(self, index):

/usr/local/lib/python2.7/dist-packages/neon/backends/nervanagpu.pyc in _assign(self, value)
    339                 if self.dtype.itemsize == 1:
    340                     drv.memset_d8_async(
--> 341                         self.gpudata, unpack_from('B', value)[0], self.size, stream)
    342                 elif self.dtype.itemsize == 2:
    343                     drv.memset_d16_async(

ArgumentError: Python argument types in
    pycuda._driver.memset_d8_async(NoneType, int, int, NoneType)
did not match C++ signature:
    memset_d8_async(unsigned long long dest, unsigned char data, unsigned int size, pycudaboost::python::api::object stream=None)

Any ideas what I could be doing wrong here?

@apark263

apark263 commented May 6, 2016

Did you create image batches for the dataset first?

If not, you will need to create them first. If you did, it would help to know the command-line arguments you are supplying to the script.
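One way to check the first point before constructing the ImageLoader is to look at what the batch writer actually left on disk. This is a minimal sketch, assuming only that the batches live somewhere under the directory passed as repo_dir. The helper name summarize_data_dir is made up for illustration, and the exact file names the batch writer produces are not stated in this thread, so nothing here depends on them.

```python
import os


def summarize_data_dir(repo_dir):
    """Return (file_count, total_bytes) for everything under repo_dir.

    Hypothetical helper: it makes no assumption about the file names
    the batch writer produces, it only reports whether anything is there.
    """
    if not os.path.isdir(repo_dir):
        raise IOError("data directory does not exist: %s" % repo_dir)
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(repo_dir):
        for name in files:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(root, name))
    return file_count, total_bytes
```

A count of zero, or an IOError, would suggest the batches were written somewhere other than the directory the loader is pointed at.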

@iandewancker
Author

Sure, I ran this command:
./neon/neon/data/batch_writer.py --set_type cifar10 --data_dir "data" --macro_size 10000 --target_size 40

from the '/home/ubuntu' dir, where the neon repo is also checked out.

Then, in an IPython session started from the same location, I'm running:

from neon.initializers import Kaiming, IdentityInit
from neon.layers import Conv, Pooling, GeneralizedCost, Affine, Activation
from neon.layers import MergeSum, SkipNode
from neon.optimizers import GradientDescentMomentum, Schedule
from neon.transforms import Rectlin, Softmax, CrossEntropyMulti, Misclassification
from neon.models import Model
from neon.data import ImageLoader
from neon.callbacks.callbacks import Callbacks, MetricCallback
from neon.backends import gen_backend
import sigopt.interface
import time

gen_backend(backend='gpu')

# load datasets
DATA_DIR_PATH = "/home/ubuntu/data/"
imgset_options = dict(inner_size=32, scale_range=40, aspect_ratio=110,
                      repo_dir=DATA_DIR_PATH, subset_pct=100)
train = ImageLoader(set_name='train', shuffle=True, do_transforms=True, **imgset_options)

@apark263

apark263 commented May 6, 2016

hmm... that is a strange one.

Could you try changing line 104 of /usr/local/lib/python2.7/dist-packages/neon/data/dataloader.py so that it instead returns:

return [self.be.iobuf(dim0=dim0, dtype=dtype, persist_values=False) for _ in range(2)]

@iandewancker
Author

Hmm, maybe that got further:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-c033fd957d22> in <module>()
----> 1 train = ImageLoader(set_name='train', shuffle=True, do_transforms=True, **imgset_options)

/usr/local/lib/python2.7/dist-packages/neon/data/imageloader.pyc in __init__(self, repo_dir, inner_size, scale_range, do_transforms, rgb, shuffle, set_name, subset_pct, nlabels, macro, contrast_range, aspect_ratio)
    105                                           target_size=1, reshuffle=shuffle,
    106                                           nclasses=self.nclass,
--> 107                                           subset_percent=subset_pct)
    108
    109     def configure(self, repo_dir, set_name, subset_pct):

/usr/local/lib/python2.7/dist-packages/neon/data/dataloader.py in __init__(self, set_name, repo_dir, media_params, target_size, index_file, shuffle, reshuffle, datum_dtype, target_dtype, onehot, nclasses, subset_percent, ingest_params)
     85         self.ingest_params = ingest_params
     86         self.load_library()
---> 87         self.alloc()
     88         self.start()
     89         atexit.register(self.stop)

/usr/local/lib/python2.7/dist-packages/neon/data/dataloader.py in alloc(self)
    115         self.device_params = DeviceParams(self.be.device_type,
    116                                           self.be.device_id,
--> 117                                           cast_bufs(self.data),
    118                                           cast_bufs(self.targets))
    119         if self.onehot:

/usr/local/lib/python2.7/dist-packages/neon/data/dataloader.py in cast_bufs(buffers)
    109
    110         def cast_bufs(buffers):
--> 111             return BufferPair(ct_cast(buffers, 0), ct_cast(buffers, 1))
    112
    113         self.data = alloc_bufs(self.datum_size, self.datum_dtype)

/usr/local/lib/python2.7/dist-packages/neon/data/dataloader.py in ct_cast(buffers, idx)
    106
    107         def ct_cast(buffers, idx):
--> 108             return ct.cast(int(buffers[idx].raw()), ct.c_void_p)
    109
    110         def cast_bufs(buffers):

TypeError: int() argument must be a string or a number, not 'NoneType'
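Both tracebacks seem to share one symptom: the backend tensor has no device pointer. In the first, self.gpudata reaches memset_d8_async as NoneType; in the second, buffers[idx].raw() returns None. Below is a minimal sketch of a check that would surface this earlier with a clearer message. FakeTensor and require_device_pointer are hypothetical stand-ins, not part of neon; only the gpudata attribute name is taken from the traceback.

```python
class FakeTensor(object):
    """Stand-in for a backend tensor; only mimics the gpudata attribute."""

    def __init__(self, gpudata):
        self.gpudata = gpudata


def require_device_pointer(tensor, label):
    """Return the tensor's device pointer as an int, or fail with a clear message."""
    ptr = getattr(tensor, "gpudata", None)
    if ptr is None:
        raise RuntimeError("%s buffer has no device memory allocated" % label)
    return int(ptr)


allocated = FakeTensor(gpudata=140000)   # arbitrary made-up address
unallocated = FakeTensor(gpudata=None)   # the state seen in the tracebacks

print(require_device_pointer(allocated, "data"))
try:
    require_device_pointer(unallocated, "targets")
except RuntimeError as err:
    print("caught: %s" % err)
```

This does not fix the missing allocation; it only turns the two different downstream errors into one message that names the buffer involved.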

@apark263

apark263 commented May 6, 2016

hmm...

Have you been able to run any other neon examples (e.g. cifar10_conv.py in
the examples directory)? Which GPU do you have, and which version of pycuda?

thanks,
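The GPU model and the pycuda version can both be collected in one go from Python. This is a hedged sketch: pycuda.VERSION and the pycuda.driver calls used here are part of pycuda, but the helper name gpu_report and the dictionary layout are made up for illustration.

```python
def gpu_report(device_index=0):
    """Collect pycuda version, device name and compute capability.

    Every field falls back to None when pycuda or a CUDA device is missing,
    so the function is safe to call on any machine.
    """
    report = {"pycuda_version": None, "device_name": None,
              "compute_capability": None}
    try:
        import pycuda
        import pycuda.driver as drv
    except Exception:
        # pycuda is not installed, or its CUDA libraries failed to load.
        return report
    report["pycuda_version"] = pycuda.VERSION
    try:
        drv.init()
        dev = drv.Device(device_index)
        report["device_name"] = dev.name()
        report["compute_capability"] = dev.compute_capability()
    except Exception:
        # No usable CUDA driver or device; leave the device fields as None.
        pass
    return report


print(gpu_report())
```

Pasting the printed dictionary into the thread gives the maintainers the same information without a back-and-forth.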

@iandewancker
Author

I'm trying to run this on an AWS g2.2xlarge machine, which uses a GK104GL [GRID K520] from NVIDIA. The pycuda version looks to be 2016.1:

In [5]: pycuda.VERSION
Out[5]: (2016, 1)

I get an error trying the cifar10_conv example as well:

ubuntu@ip-172-31-46-136:~/neon/examples$ python cifar10_conv.py
2016-05-06 18:59:26,618 - neon.backends.nervanagpu - WARNING - Neon is highly optimized for Maxwell GPUs. Although you might get speedups over CPUs, note that you are running on a pre-Maxwell GPU and you might not experience the fastest performance. For faster performance using the Nervana Cloud contact [email protected]
Downloading file: /home/ubuntu/nervana/data/cifar-10-python.tar.gz
Download Progress |██████████████████████████████████████████████████| Download Complete
Traceback (most recent call last):
  File "cifar10_conv.py", line 73, in <module>
    mlp.fit(train, optimizer=opt_gdm, num_epochs=num_epochs, cost=cost, callbacks=callbacks)
  File "/usr/local/lib/python2.7/dist-packages/neon/models/model.py", line 149, in fit
    self._epoch_fit(dataset, callbacks)
  File "/usr/local/lib/python2.7/dist-packages/neon/models/model.py", line 179, in _epoch_fit
    self.bprop(delta)
  File "/usr/local/lib/python2.7/dist-packages/neon/models/model.py", line 211, in bprop
    return self.layers.bprop(delta)
  File "/usr/local/lib/python2.7/dist-packages/neon/layers/container.py", line 207, in bprop
    error = l.bprop(error)
  File "/usr/local/lib/python2.7/dist-packages/neon/layers/layer.py", line 654, in bprop
    alpha=alpha, beta=beta)
  File "/usr/local/lib/python2.7/dist-packages/neon/backends/nervanagpu.py", line 1652, in bprop_conv
    layer.bprop_kernels.bind_params(E, F, grad_I, alpha, beta, bsum)
  File "/usr/local/lib/python2.7/dist-packages/neon/backends/convolution.py", line 293, in bind_params
    assert bsum is not None, "must use initialized bsum config"
AssertionError: must use initialized bsum config

@iandewancker
Author

This was my install script if that is helpful

sudo apt-get update && sudo apt-get -yq upgrade
sudo apt-get install python-dev
sudo apt-get install -y libopencv-dev python-opencv libhdf5-dev
#sudo apt-get install -yq linux-image-extra-`uname -r`
sudo apt-get -y install git

sudo pip install -q --upgrade pip
sudo pip install -U numpy
sudo pip install -U scipy
sudo pip install scikit-learn==0.17 joblib sigopt pystache awscli
sudo pip install --upgrade pillow
sudo apt-get install libjpeg-dev zlib1g-dev

wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get -yq install cuda

git clone https://github.com/NervanaSystems/neon.git
cd neon && sudo make sysinstall
sudo ln -sf /usr/local/cuda-7.5/bin/nvcc /usr/bin/nvcc
export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-7.5/bin:$PATH

@apark263

apark263 commented May 6, 2016

Ah ok, it's a non-Maxwell card. I guess there are still some issues running
dataloader-dependent examples (cifar10_msra) on Kepler cards. It seems the
device buffers for storing data and targets are not getting allocated as
they should. We will take a look at those.

In the meantime, the bsum AssertionError in the cifar10_conv example can be
worked around by supplying -r 0 on the command line (for example,
python cifar10_conv.py -r 0).
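The Kepler versus Maxwell distinction can be read off the CUDA compute capability: Kepler parts report major version 3 and Maxwell parts report major version 5. Below is a small sketch of that mapping, limited to those two generations. The function name is hypothetical, and how neon itself detects the architecture is not shown in this thread.

```python
def gpu_generation(compute_capability):
    """Map a (major, minor) compute capability to a generation name.

    Only the two generations discussed here are named; anything else
    is reported as 'other'.
    """
    major, _minor = compute_capability
    if major == 3:
        return "kepler"
    if major == 5:
        return "maxwell"
    return "other"


# GK104-class parts such as the GRID K520 report compute capability 3.0.
print(gpu_generation((3, 0)))
```

The tuple has the same (major, minor) shape that pycuda's Device.compute_capability() returns, so the two can be combined directly.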

@iandewancker
Author

Thanks for the help! Any chance an earlier version of neon might work better with the Kepler GPUs?

2 participants