Integer data type #91

timmoon10 · 2017-06-28T02:08:48Z

As of d2c414a, int is the standard integer data type. However, we may need 64-bit integers for very large matrices. Variables susceptible to overflow should be changed to El::Int. As a rule of thumb, indices into Elemental matrices should be El::Ints.
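To make the overflow concern concrete, here is a minimal sketch, assuming a column-major layout and using a local `Int` alias as a stand-in for a 64-bit `El::Int` (the real typedef lives in Elemental and is not included here). The helper names `linear_index` and `exceeds_int_range` are hypothetical and only illustrate the rule of thumb.

```cpp
#include <cassert>
#include <limits>

// Stand-in for El::Int in a 64-bit build (assumption, not Elemental's header).
using Int = long long;

// Linear offset of entry (row, col) in a column-major buffer
// with leading dimension ldim.
inline Int linear_index(Int row, Int col, Int ldim) {
  assert(row >= 0 && col >= 0 && ldim > 0);
  return row + col * ldim;
}

// True if a height x width matrix has more entries than a 32-bit int can count.
inline bool exceeds_int_range(Int height, Int width) {
  assert(height >= 0 && width >= 0);
  if (width == 0) {
    return false;
  }
  return height > std::numeric_limits<int>::max() / width;
}
```

A 50000 x 50000 matrix already has 2.5 billion entries, so the offset of its last entry does not fit in a 32-bit `int`, while it is computed without trouble in the 64-bit type.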


ndryden · 2017-06-28T03:56:41Z

With these changes, I'm getting a lot of warnings about narrowing casts.
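One possible way to handle such warnings is to make each narrowing explicit and checked. The sketch below uses a hypothetical helper, `narrow_checked`, which is not part of LBANN or Elemental; it throws when the value does not survive the conversion.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Convert between integer types, throwing if the value changes.
// Hypothetical helper for illustration only.
template <typename To, typename From>
To narrow_checked(From value) {
  static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                "narrow_checked only supports integer types");
  const To result = static_cast<To>(value);
  // The round-trip comparison catches truncation; the sign comparison
  // catches wraparound between signed and unsigned types.
  const bool round_trip_ok = static_cast<From>(result) == value;
  const bool sign_ok = (result < To{}) == (value < From{});
  if (!round_trip_ok || !sign_ok) {
    throw std::range_error("narrow_checked: value does not fit in target type");
  }
  return result;
}
```

For example, `narrow_checked<int>(vec.size())` states the intent at the call site and fails loudly, rather than silently truncating, if the vector ever holds more than `INT_MAX` elements.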

ndryden · 2017-06-28T04:42:43Z

(Background/refresher: fundamental C++ types and fixed-width integer types)

Just to add some context for this discussion, these are the main sources of integer-type constraints we have to work with, as far as I can recall:

Elemental defines El::Int to be either int or long long depending on our compile-time flags. (There is also El::Unsigned.) We want to use El::Int when interacting with Elemental matrices.
The C++ STL (e.g. std::vector) tends to use size_t for just about any size-related quantity. This can lead to us either narrowing a size_t or comparing signed/unsigned integers.
cuDNN expects parameters to be int.
MPI expects parameters to be int.

Ideally, we should come up with a consistent use of integers that satisfies all of these.
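As a rough sketch of what the boundaries between these conventions might look like, the helpers below compare a signed index against a `size_t` without a signed/unsigned mismatch, and split a 64-bit count into pieces that each fit in an `int`, as an `int`-based interface such as MPI or cuDNN would require. `Int` is a local stand-in for a 64-bit `El::Int`, and both function names are hypothetical; no MPI or cuDNN calls are made.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for El::Int in a 64-bit build (assumption).
using Int = long long;

// Sign-safe "a < b" for a signed index and an unsigned size.
inline bool less_than(Int a, std::size_t b) {
  // A negative index is smaller than any size; otherwise compare as unsigned.
  return a < 0 || static_cast<unsigned long long>(a) < b;
}

// Split a 64-bit element count into chunks that each fit in an int.
inline std::vector<int> split_into_int_chunks(
    Int total, int max_chunk = std::numeric_limits<int>::max()) {
  assert(total >= 0 && max_chunk > 0);
  std::vector<int> chunks;
  while (total > 0) {
    const Int piece = std::min<Int>(total, max_chunk);
    chunks.push_back(static_cast<int>(piece));  // safe: piece <= max_chunk
    total -= piece;
  }
  return chunks;
}
```

Each chunk could then be passed to a separate call of the `int`-based API. Whether chunking is appropriate depends on the operation, so this is only one possible approach.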

Edit: An additional thought: although we wouldn't want it in production builds, we could compile with -ftrapv, which traps on signed overflow in addition, subtraction, and multiplication.
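`-ftrapv` is a compiler flag, so it has no source-level form. As a complementary illustration of the kind of overflow it would catch, here is a small explicit check written in portable C++. `checked_mul` is a hypothetical helper, and this is a different technique from the flag itself: it throws an exception instead of trapping.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Multiply two non-negative ints, throwing instead of overflowing.
// Hypothetical helper for illustration only.
inline int checked_mul(int a, int b) {
  assert(a >= 0 && b >= 0);
  // a * b > INT_MAX  <=>  b > INT_MAX / a  (for a > 0, integer division)
  if (a != 0 && b > std::numeric_limits<int>::max() / a) {
    throw std::overflow_error("checked_mul: product exceeds INT_MAX");
  }
  return a * b;
}
```

With 32-bit `int`, `checked_mul(46340, 46340)` succeeds, while `checked_mul(46341, 46341)` throws. The unchecked product in the second case is the sort of operation a `-ftrapv` build would abort on.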

* Add environment variable LBANN_NUM_IO_PARTITIONS Specify the number of partitions in the depth dimension of the Cosmoflow samples.
* Adjust the base offset for parallel sample I/O
* WIP: Further adjustment of sample sizes
* WIP: sample size adjustment
* WIP: sample size adjustment
* Remove debug output
* Cosmoflow parallel io (LBANN#86)
* before rebase
* updating
* updating
* small changes
* moving around where data is read in NOT DONE YET
* updated some comments and some todo
* cleaning up
* added comm member variabe
* cleaning up
* compiles, fixes stray variables and typos, adds correct member variables
* changing responses to float,taking away division, removing from image_data_reader
* fixing a mistake
* oops, changing m_all_responses back to float
* changed some variable names, changed indenting and fixed vim problems, fixed file access
* fixed duplicate count
* transposed dimensions, reverted resnet, took odd spacing and print statements out
* removed timing
* fixing comments and spacing
* Missing semicolon
* Fix type mismatch
* Remove trailing whitespaces
* HDF5 bug fixes
* Size adjustment fix
* Refactoring
* Support strided rank ordering
* Fix hang in HDF5 MPI-IO HDF5 caused hanging. Likely because a HDF5 property was created with MPI at every fetch_datum. The property is now moved out of the function and is only done once, so it should not hang anymore. Yet, MPI-IO is disabled for now. Should be looked into again once everything becomes working.
* Disables assertion This assertion fails when the last mini-batch is not a full one. Not sure why it fails now and not before.
* Use normalized parameters in Cosmoflow
* Fix copying of a non-halo-expanded host tensor to a halo-extended device tensor. The distconv::Copy function doesn't seem to be working correctly, though more comprehensive investigation is needed.
* Enable assertion check on mini-batch size again
* Disable debug output
* Temporary add debug dump in generic_input_layer
* Delete irelevant comment
* Fix response value loading when rank reordering is not used
* Formatting
* Fix protobuf version in superbuild
* Cleanup before merging to the mainline branch
* Further cleanup
* Check if int16 input is enabled

timmoon10 self-assigned this Jun 28, 2017

timmoon10 mentioned this issue Jun 28, 2017

Normalize constructor arguments #88

Closed

This was referenced Aug 3, 2018

Replacing El::Int with an IntType typedef. #567

Closed

Replacing int with IntType in communicator header. #568

Closed

timmoon10 mentioned this issue Sep 15, 2018

Batch norm refactor #620

Merged

timmoon10 added the refactor label Nov 5, 2018

timmoon10 mentioned this issue May 1, 2019

New preprocessing pipeline #1014

Merged

timmoon10 mentioned this issue Aug 1, 2019

change 'int' to 'size_t' to accommodate large sample sizes #1139

Merged

timmoon10 mentioned this issue Sep 9, 2019

Change the fp_setup_ouput function in layer class to use size_t #1224

Open

timmoon10 mentioned this issue Oct 10, 2019

Fix for case where global archive size is > INT_MAX #1288

Merged

timmoon10 mentioned this issue Mar 9, 2021

Handle weights with dims >2.1B #1819

Merged
