Data type diversity

Started by
47 comments, last by Calin 1 year, 10 months ago

The int/bool thing is a C/C++ legacy issue. It's a classic language design mistake to not have a bool type. If you have to retrofit one later, the semantics are messy.

Pascal had PACKED ARRAY, and if you created a PACKED ARRAY of bool, you got a bit array.

Rust has a bit array crate with reasonably good optimization, and a bool type. Also enforced enums, where safe code cannot get an invalid value into an enum.


Nagle said:
It's a classic language design mistake to not have a bool type. If you have to retrofit one later, the semantics are messy.

I'm confused. You mean making a new programming language with no bool type in it, and then trying to add a bool type from within a program's source, without editing the language's own source code? Yeah, I guess that can turn out messy.

My project's Facebook page is “DreamLand Page”

I'm confused. You mean making a new programming language with no bool type in it, and then trying to add a bool type from within a program's source, without editing the language's own source code?

Both C and Python started out without a bool type, and added one later. Here's an overview of the pain when it was added to Python in 2002. Retrofits like that produce some strange results: in Python, “True + True” is 2. Then there's a history of people creating constants “true”, “TRUE”, or “True” in languages that don't have a “bool” type.

“bool” was added to C in C99 (as the keyword “_Bool”, with “bool” provided as a macro in <stdbool.h>), with similar problems.

So, if there will ever be a “bool” type, put it in before people write code in the language.
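The Python example above has a direct counterpart in C++, where bool also takes part in integer promotions. A minimal sketch, with `to_flag` and `to_int` as made-up helper names for illustration only:

```cpp
#include <type_traits>

// Operands of + undergo integral promotion, so adding two bools yields an int.
static_assert(std::is_same<decltype(true + true), int>::value,
              "bool operands are promoted to int");
static_assert(true + true == 2, "true converts to 1");

// Hypothetical helpers that rely only on the implicit conversions.
constexpr bool to_flag(int value) { return value; }  // nonzero becomes true
constexpr int to_int(bool flag) { return flag; }     // true becomes 1

// Round-tripping an int through bool silently loses the original value.
static_assert(to_int(to_flag(42)) == 1, "42 -> true -> 1");
static_assert(to_int(to_flag(0)) == 0, "0 -> false -> 0");
```

None of this is an error as far as the compiler is concerned, which is roughly the kind of leftover a retrofit tends to carry.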

Nagle said:
Both C and Python started out without a bool type, and added one later.

So it's an add-on to the actual language, but it doesn't integrate too well with the other components of the language because it wasn't designed as part of the language from the beginning, and the new version of the language had to be backwards compatible (any C or Python programs that existed when the addition took place had to keep working).


MagForceSeven said:

Calin said:
the diversity is the problem as far as I'm concerned.

Diversity is the whole point and ultimately beneficial to those that can control it. The more types you have, the more control you have over each type and over the assumptions you can build into the compiler's work or other automated tools.

I could have a handle that's just an int, but if I make a new type (maybe it even just wraps an int) I can both prevent people from making trivial mistakes as well as from doing something malicious. Even if I've got two different systems that need handles, they each get their own type so that you couldn't accidentally interchange them.
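A rough sketch of what that could look like in C++. All names here (`Handle`, `TextureTag`, `SoundTag`, `texture_slot`) are invented for illustration and not taken from any particular engine:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper: the Tag parameter exists only to make each
// instantiation a distinct type; it carries no data at runtime.
template <typename Tag>
class Handle {
public:
    explicit constexpr Handle(std::uint32_t raw) : raw_(raw) {}
    constexpr std::uint32_t raw() const { return raw_; }
    friend constexpr bool operator==(Handle a, Handle b) { return a.raw_ == b.raw_; }

private:
    std::uint32_t raw_;
};

struct TextureTag {};
struct SoundTag {};
using TextureHandle = Handle<TextureTag>;
using SoundHandle = Handle<SoundTag>;

// Hypothetical subsystem function that only accepts its own handle type.
constexpr std::uint32_t texture_slot(TextureHandle handle) { return handle.raw(); }

// Both kinds of mistake are rejected at compile time.
static_assert(!std::is_convertible<SoundHandle, TextureHandle>::value,
              "handles from different systems do not interchange");
static_assert(!std::is_convertible<int, TextureHandle>::value,
              "a bare int is not silently accepted as a handle");
```

Calling `texture_slot(SoundHandle{3})` or `texture_slot(3)` fails to compile, while the wrapper itself should cost nothing more than the int it holds.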

Types that one would ideally use in a program are usually approximated loosely, leaving some room for errors; any language construct that allows a more precise approximation, like a Boolean type to represent Boolean values, eliminates some otherwise possible errors and represents a useful improvement.

For example, consider array indexes. In all popular languages they are restricted to be integers, because anything else makes no sense. In C they are signed integers in a certain range (negative offsets must be allowed). In other languages they are unsigned, but can still point past the end of the array (for example, the index into a three-element array needs at least two bits and can therefore hold an invalid value). A type like “integer in a user-specified range” is typically available only with some user effort (e.g. using a special-purpose library) in complex languages. Only in very advanced cases is there language support for some compile-time range checking (i.e. known array sizes imply restricted types for the numbers that are used as indices).
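As an illustration of the “user effort” route, here is a minimal C++ sketch of a range-restricted index. `BoundedIndex` and `element_at` are hypothetical names, and the range check is done at runtime on construction rather than at compile time:

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical index type whose values always lie in [0, N).
template <std::size_t N>
class BoundedIndex {
public:
    // The range check happens once, when the index is created.
    explicit BoundedIndex(std::size_t value) : value_(value) {
        if (value >= N) throw std::out_of_range("index outside [0, N)");
    }
    std::size_t get() const { return value_; }

private:
    std::size_t value_;
};

// The index type carries N, so an index built for a different array size
// is rejected by the compiler and the lookup needs no further check.
template <typename T, std::size_t N>
const T& element_at(const std::array<T, N>& items, BoundedIndex<N> index) {
    return items[index.get()];
}
```

With a `std::array<int, 3>`, passing a `BoundedIndex<4>` does not compile, and constructing `BoundedIndex<3>(3)` throws instead of producing an out-of-range index.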

Omae Wa Mou Shindeiru

MagForceSeven said:
Diversity is the whole point and ultimately beneficial to those that can control it. The more types you have

Depends what your purpose is. Diversity is good for fine tuning and optimisation. But it's hard to learn for newcomers and also very demanding and expensive. When you have fewer blocks to build with (type wise) you probably can't stack them as high, but it's easier to understand how stacking works and you can get much faster at doing your own stuff.

My statement is related not only to types but also different ways of doing one and the same thing.


Calin said:
Diversity is good for fine tuning and optimisation. But it's hard to learn for newcomers and also very demanding and expensive.

This is always the case. Usually the lower-level the construct, the more the language must assume the programmer knows.

At the assembly programming level, the assembler assumes the programmer knows all the risks and is doing everything right.

In languages like C and C++ the compiler can do an awful lot of hand-holding and detects many conditions with diagnostic warnings, but still ultimately assumes that if the programmer says “do it” the programmer knows what they're saying. Sometimes the costs of the abstractions are high, like with Boolean values versus bools and with bit manipulation even if the hardware doesn't support it directly. Even so, it provides some options and enough to usually get the job done.

As you get even more abstract in languages, there are tools and languages that will do the “correct” thing even if they're terrible at the hardware level. You'll have a type and get into enormous hidden conversions, as seen in languages like Python or JavaScript. Rust offers lots of protections in memory and for expressions, but they come at an implementation cost. They're easy for beginners to approach but the naive approach can have unexpected performance problems. In all of these languages it's easy to write code that hums along in one scenario, then suddenly falls off the performance cliff for no obvious reason.

Drawing back to C++ and the original topic, that's a tradeoff the language has made. The original language didn't deal with Boolean logic, it was made as a portable way to program at a higher level than machine code. Thus, the original language dealt with a char as the smallest addressable unit, and if you wanted bit manipulation you could do it. Adding Boolean logic meant additional rules around conversions and promotions, and the decision to keep it as a single addressable object meant requiring a full byte (or more) to address what could be stored in a single bit. That's potentially wasteful from a memory perspective but good for processing performance, and it ensures correct behavior of Boolean logic. The tradeoff is a bitfield which is trivially implemented but requires additional processing work, sometimes significant processing work, in order to regain that space in the time/space exchange. The compiler still holds your hand and ensures correct behavior, but you're exchanging storage space for processing time, a common optimization choice.

Switching between the two requires more knowledge and is less beginner friendly, but offers all the options to those who know to take them. The default, easier-to-use approach is usually the right one: the bool data type encodes Boolean logic in a single machine-addressable byte, which typically means 7 bits of wasted space but faster processing.
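To make the space side of that exchange concrete, here is a small sketch comparing an array of bool with a packed std::bitset. The names `kFlagCount`, `ByteFlags`, `BitFlags`, and `mark` are made up for this example, and the second size comparison holds on mainstream implementations rather than being guaranteed by the standard:

```cpp
#include <array>
#include <bitset>
#include <cstddef>

constexpr std::size_t kFlagCount = 1024;  // arbitrary size for illustration

// One addressable object per flag: at least one byte each.
using ByteFlags = std::array<bool, kFlagCount>;
// Packed representation: one bit per flag, reached by shifting and masking.
using BitFlags = std::bitset<kFlagCount>;

static_assert(sizeof(ByteFlags) >= kFlagCount,
              "each bool occupies at least one byte");
// Assumption: true for common standard libraries, not required by the standard.
static_assert(sizeof(BitFlags) < sizeof(ByteFlags),
              "the bitset packs the flags more densely");

// The source looks almost the same, but the bitset version has to
// read, mask, and write back a whole word instead of storing one byte.
inline void mark(ByteFlags& flags, std::size_t i) { flags[i] = true; }
inline void mark(BitFlags& flags, std::size_t i) { flags.set(i); }
```

Whether the packed form ends up slower in practice depends on the access pattern and on cache behaviour, so it is worth measuring before switching.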

Calin said:
Diversity is good for fine tuning and optimisation.

I wouldn't call optimization the primary purpose of having many types specific to different use cases, nor of the type system or type safety as a concept. The purpose of type safety is writing correct code by making errors happen at compile-time where they are much less expensive to fix, rather than runtime where they may appear on the user's machine instead of yours. It has also been argued that having types visible in the source code enhances readability, as well. Sometimes optimizations can result from the use of the type system, it's true, but I've yet to meet a programmer who primarily valued the type system for its optimization potential.

frob said:
This is always the case. Usually the lower-level the construct, the more the language must assume the programmer knows. At the assembly programming level, the assembler assumes the programmer knows all the risks and is doing everything right.

[edit] This is only remotely related to what you're saying, but one thing I noticed is that past a certain level things in C++ begin to look a lot like scripting. At the level of functions, pointers and classes it looks like standard programming, but once you get past that it starts to feel like scripting within C++ itself.


Calin said:

Depends what your purpose is. Diversity is good for fine tuning and optimisation. But it's hard to learn for newcomers and also very demanding and expensive. When you have fewer blocks to build with (type wise) you probably can't stack them as high, but it's easier to understand how stacking works and you can get much faster at doing your own stuff.

My statement is related not only to types but also different ways of doing one and the same thing.

That's not how it works.

Types are the basic building blocks of C++ programs. ~90% of C++ programming consists of creating new types (classes, enums) from existing types. A non-trivial C++ program will have thousands of types. The bool type is a drop in the bucket compared to that.

Does bool need to be a built-in type? Not really, it could be a library enum. Is it necessary? It's one of the first things a competent programmer would build if the language didn't provide it. Does it make C++ harder to learn? No, it makes it easier, because it means that the beginner can use the bool type without first learning how to implement it themselves.
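For what it's worth, a library version might look roughly like the sketch below. `Bool` and its operators are hypothetical, and the sketch still leans on the built-in bool for the comparisons inside, since C++ offers no way around that:

```cpp
// Hypothetical library-level Boolean, imagining the built-in did not exist.
enum class Bool : unsigned char { False = 0, True = 1 };

constexpr Bool operator!(Bool a) {
    return a == Bool::True ? Bool::False : Bool::True;
}

// & and | are used instead of && and || because overloaded && and ||
// would lose short-circuit evaluation.
constexpr Bool operator&(Bool a, Bool b) {
    return (a == Bool::True && b == Bool::True) ? Bool::True : Bool::False;
}

constexpr Bool operator|(Bool a, Bool b) {
    return (a == Bool::True || b == Bool::True) ? Bool::True : Bool::False;
}

static_assert((Bool::True & !Bool::False) == Bool::True, "true and (not false)");
static_assert((Bool::False | Bool::False) == Bool::False, "false or false");

// int n = Bool::True + Bool::True;  // does not compile: no arithmetic on a scoped enum
```

A scoped enum like this rejects arithmetic such as “True + True”, but it cannot be used directly as an `if` condition, which is one practical reason a built-in type is more convenient.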

This topic is closed to new replies.
