Comment by esafak 10 months ago

No, they are not. Model outputs can be discretized, but the model parameters (excluding hyperparameters) are typically continuous. That's why we can use gradient descent.
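A minimal sketch of that distinction, assuming a toy quadratic loss and a hand-picked learning rate (neither is from the thread): the parameters are real numbers nudged by small gradient steps, not discrete values to be searched combinatorially.

```python
import numpy as np

# Gradient descent on a toy quadratic loss ||w - target||^2.
rng = np.random.default_rng(0)
w = rng.normal(size=3)                  # continuous, real-valued parameters
target = np.array([1.0, -2.0, 0.5])     # minimizer of the toy loss

lr = 0.1
for _ in range(100):
    grad = 2.0 * (w - target)           # exact gradient of the loss
    w -= lr * grad                      # small, smooth update; no bit flipping

print(np.round(w, 3))                   # converges to ~[1.0, -2.0, 0.5]
```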

bob1029 10 months ago

Where are the model parameters stored and how are they represented?

  • esafak 10 months ago

    On disk or in memory as multidimensional arrays ("tensors" in ML speak).

    • bob1029 10 months ago

      Do we agree that these memories consist of a finite # of bits?

      • esafak 10 months ago

        Yes, of course.

        Consider a toy model with just 1000 double-precision (64-bit) parameters, i.e. 64,000 bits of parameter data. If you're going to randomly flip bits over this 2^64,000 search space while you evaluate a nontrivial fitness function, genetic-style, you'll be waiting a long time (a back-of-the-envelope sketch follows the thread).

        • bob1029 10 months ago

          I agree that if you approach it naively, you will accomplish nothing.

          With some optimization, you can evolve programs over search spaces of 10^10000 states (i.e., 10 unique instructions, 10,000 instructions long) and beyond; a toy sketch of such a mutation loop follows the thread.

          Visiting every possible combination is not the goal here.
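As a back-of-the-envelope check on the 1000-parameter example above, the sketch below counts the states and then runs a naive keep-only-improvements bit-flip search. The fitness function (push all parameters toward zero) is an assumption for illustration, not anything from the thread.

```python
import numpy as np

# 1000 float64 parameters = 64,000 bits, so blind search ranges over 2**64000 states.
n_params, bits_per_param = 1000, 64
total_bits = n_params * bits_per_param
print(f"search space: 2^{total_bits} (~10^{int(total_bits * 0.30103)}) states")

# Naive genetic-style bit flipping on a toy fitness function, keeping improvements only.
rng = np.random.default_rng(0)
params = rng.standard_normal(n_params)
raw = bytearray(params.tobytes())                   # the parameters as raw bytes

def fitness(p):
    val = -np.sum(p ** 2)                           # toy objective: drive parameters to zero
    return val if np.isfinite(val) else -np.inf     # bit flips can produce NaN/Inf

best = fitness(params)
for _ in range(10_000):
    i = int(rng.integers(len(raw)))
    b = int(rng.integers(8))
    raw[i] ^= 1 << b                                # flip one random bit
    f = fitness(np.frombuffer(bytes(raw), dtype=np.float64))
    if f > best:
        best = f                                    # keep the improvement
    else:
        raw[i] ^= 1 << b                            # otherwise undo the flip

print(best)   # improves far more slowly than gradient descent on the same objective
```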
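And a rough sketch of the counterpoint: a simple mutate-and-keep-improvements loop over a 10-opcode, 10,000-instruction program makes steady progress without visiting more than a vanishing fraction of the 10^10000 states. The hidden-target fitness function is a stand-in assumed for illustration; a real system would execute each candidate program and score its behavior.

```python
import numpy as np

# A (1+1)-style evolutionary loop over a "program" of 10,000 instructions drawn
# from 10 opcodes: a 10^10000-state space searched by mutation, not enumeration.
N_INSTR, N_OPS = 10_000, 10
rng = np.random.default_rng(0)

target = rng.integers(N_OPS, size=N_INSTR)     # hidden optimum (illustration only)
program = rng.integers(N_OPS, size=N_INSTR)    # random starting program

def fitness(prog):
    return int(np.sum(prog == target))         # number of correct instructions

best = fitness(program)
for _ in range(200_000):
    i = int(rng.integers(N_INSTR))
    old = program[i]
    program[i] = rng.integers(N_OPS)           # point mutation of one instruction
    f = fitness(program)
    if f >= best:
        best = f                               # accept neutral or better mutants
    else:
        program[i] = old                       # otherwise revert

print(best, "/", N_INSTR)                      # climbs steadily; the full space is never visited
```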