Also update references to the loss functions defined in adaptive.
Joseph Weston authored on 19/02/2018 18:39:18
@@ -448,22 +448,22 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "# Custom point choosing logic for 1D and 2D"
+ "# Custom adaptive logic for 1D and 2D"
  ]
  },
  {
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "The `Learner1D` and `Learner2D` implement a certain logic for chosing points based on the existing data.\n",
+ "`Learner1D` and `Learner2D` both work on the principle of subdividing their domain into subdomains, and assigning a property to each subdomain, which we call the *loss*. The algorithm for choosing the best place to evaluate our function is then simply *take the subdomain with the largest loss and add a point in the center, creating new subdomains around this point*. \n",
  "\n",
- "For some functions this default stratagy might not work, for example you'll run into trouble when you learn functions that contain divergencies.\n",
+ "The *loss function* that defines the loss per subdomain is the canonical place to define what regions of the domain are \"interesting\".\n",
+ "The default loss function for `Learner1D` and `Learner2D` is sufficient for a wide range of common cases, but it is by no means a panacea. For example, the default loss function will tend to get stuck on divergences.\n",
  "\n",
- "Both the `Learner1D` and `Learner2D` allow you to use a custom loss function, which you specify as an argument in the learner. See the doc-string of `Learner1D` and `Learner2D` to see what `loss_per_interval` and `loss_per_triangle` need to return and take as input.\n",
+ "Both the `Learner1D` and `Learner2D` allow you to specify a *custom loss function*. Below we illustrate how you would go about writing your own loss function. The documentation for `Learner1D` and `Learner2D` specifies the signature that your loss function needs to have in order for it to work with `adaptive`.\n",
  "\n",
- "As an example we implement a homogeneous sampling strategy (which of course is not the best way of handling divergencies).\n",
  "\n",
- "Note that both these loss functions are also available from `adaptive.learner.learner1d.uniform_sampling` and `adaptive.learner.learner2d.uniform_sampling`."
+ "Say we want to properly sample a function that contains divergences. A simple (but naive) strategy is to *uniformly* sample the domain:\n"
  ]
  },
  {
@@ -473,6 +473,7 @@
  "outputs": [],
  "source": [
  "def uniform_sampling_1d(interval, scale, function_values):\n",
+ "    # Note that we never use 'function_values'; the loss is just the size of the subdomain\n",
  "    x_left, x_right = interval\n",
  "    x_scale, _ = scale\n",
  "    dx = (x_right - x_left) / x_scale\n",
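The hunk above only shows the top of the 1D cell. A minimal sketch of how the complete cell plausibly reads, assuming the loss is simply the rescaled interval width `dx` and that a divergent test function and a runner call follow; the cap on the singular point is an addition for this sketch only:

    import adaptive

    def uniform_sampling_1d(interval, scale, function_values):
        # the loss ignores 'function_values' entirely: it is just the rescaled
        # width of the subdomain, so the widest interval is always split next
        x_left, x_right = interval
        x_scale, _ = scale
        dx = (x_right - x_left) / x_scale
        return dx

    def f_divergent_1d(x):
        # diverges at x = 0; cap the exact singular point so this sketch cannot
        # raise ZeroDivisionError if the midpoint 0.0 is ever requested
        return 1 / x**2 if x != 0 else 1e12

    learner = adaptive.Learner1D(f_divergent_1d, bounds=(-1, 1),
                                 loss_per_interval=uniform_sampling_1d)
    runner = adaptive.BlockingRunner(learner, goal=lambda l: l.loss() < 0.01)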
@@ -492,7 +493,9 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "%%opts EdgePaths (color='w')\n",
+ "%%opts EdgePaths (color='w') Image [logz=True]\n",
+ "\n",
+ "from adaptive.runner import SequentialExecutor\n",
  "\n",
  "def uniform_sampling_2d(ip):\n",
  "    from adaptive.learner.learner2D import areas\n",
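The 2D cell is cut off just after the `areas` import. A plausible completion, assuming the loss of a triangle is derived purely from its area so that the largest triangles are always refined first; taking the square root so the loss has units of length is an assumption, mirroring the 1D version:

    import numpy as np
    from adaptive.learner.learner2D import areas

    def uniform_sampling_2d(ip):
        # 'ip' is the interpolator object that Learner2D passes to the loss;
        # 'areas' returns the area of every triangle in its triangulation
        A = areas(ip)
        # loss per triangle: a length scale derived from the triangle's area
        return np.sqrt(A)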
@@ -504,18 +507,32 @@
  "    return 1 / (x**2 + y**2)\n",
  "\n",
  "learner = adaptive.Learner2D(f_divergent_2d, [(-1, 1), (-1, 1)], loss_per_triangle=uniform_sampling_2d)\n",
- "runner = adaptive.BlockingRunner(learner, goal=lambda l: l.loss() < 0.02)\n",
- "learner.plot(tri_alpha=0.3)"
+ "\n",
+ "# this takes a while, so use the async Runner so we know *something* is happening\n",
+ "runner = adaptive.Runner(learner, goal=lambda l: l.loss() < 0.02)\n",
+ "runner.live_info()\n",
+ "runner.live_plot(update_interval=0.2,\n",
+ "                 plotter=lambda l: l.plot(tri_alpha=0.3).relabel('1 / (x^2 + y^2) in log scale'))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The uniform sampling strategy is a common case to benchmark against, so the 1D and 2D versions are included in `adaptive` as `adaptive.learner.learner1D.uniform_sampling` and `adaptive.learner.learner2D.uniform_sampling`."
  ]
  },
  {
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "#### Doing better\n",
- "Of course we can improve on the the above result, since just homogeneous sampling is usually the dumbest way to sample.\n",
+ "### Doing better\n",
+ "\n",
+ "Of course, using `adaptive` for uniform sampling is a bit of a waste!\n",
+ "\n",
+ "Let's see if we can do a bit better. Below we define a loss per subdomain that scales with the degree of nonlinearity of the function (this is very similar to the default loss function for `Learner2D`), but which is 0 for subdomains smaller than a certain area, and infinite for subdomains larger than a certain area.\n",
  "\n",
- "The loss function (slightly more general version) below is available as `adaptive.learner.learner2D.resolution_loss`."
+ "A loss defined in this way means that the adaptive algorithm will first prioritise subdomains that are too large (infinite loss). After all subdomains are appropriately small it will prioritise places where the function is very nonlinear, but will ignore subdomains that are too small (0 loss)."
  ]
  },
  {
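The new markdown cell above points at the bundled uniform losses. A short usage sketch, taking the import paths quoted in that cell at face value; the learners are only constructed here, not run:

    import adaptive
    from adaptive.learner.learner1D import uniform_sampling as uniform_sampling_1d
    from adaptive.learner.learner2D import uniform_sampling as uniform_sampling_2d

    def f_divergent_2d(xy):
        x, y = xy
        return 1 / (x**2 + y**2)

    # the bundled losses are passed exactly like the hand-written ones above
    learner_1d = adaptive.Learner1D(lambda x: 1 / (x**2 + 1e-3), bounds=(-1, 1),
                                    loss_per_interval=uniform_sampling_1d)
    learner_2d = adaptive.Learner2D(f_divergent_2d, bounds=[(-1, 1), (-1, 1)],
                                    loss_per_triangle=uniform_sampling_2d)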
@@ -529,13 +546,17 @@
  "def resolution_loss(ip, min_distance=0, max_distance=1):\n",
  "    \"\"\"min_distance and max_distance should be in between 0 and 1\n",
  "    because the total area is normalized to 1.\"\"\"\n",
+ "\n",
  "    from adaptive.learner.learner2D import areas, deviations\n",
+ "\n",
  "    A = areas(ip)\n",
  "\n",
- "    # `deviations` returns an array of the same length as the\n",
- "    # vector your function to be learned returns, so 1 in this case.\n",
- "    # Its value represents the deviation from the linear estimate based\n",
- "    # on the gradients inside each triangle.\n",
+ "    # 'deviations' returns an array of shape '(n, len(ip))', where\n",
+ "    # 'n' is the dimension of the output of the learned function.\n",
+ "    # In this case we know that the learned function returns a scalar,\n",
+ "    # so 'deviations' returns an array of shape '(1, len(ip))'.\n",
+ "    # It represents the deviation of the function value from a linear estimate\n",
+ "    # over each triangular subdomain.\n",
  "    dev = deviations(ip)[0]\n",
  "    \n",
  "    # we add terms of the same dimension: dev == [distance], A == [distance**2]\n",
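The hunk stops just before the loss is assembled. A sketch of how the remainder of `resolution_loss` plausibly looks, given its docstring and the surrounding markdown: combine the deviation and the area into one quantity of dimension area, then clamp losses outside the `min_distance`/`max_distance` window. The exact combination `np.sqrt(A) * dev + A` is an assumption, chosen to be consistent with the dimension comment in the hunk:

    import numpy as np
    from adaptive.learner.learner2D import areas, deviations

    def resolution_loss(ip, min_distance=0, max_distance=1):
        """min_distance and max_distance should be in between 0 and 1
        because the total area is normalized to 1."""
        A = areas(ip)

        # deviation of the function from a linear estimate over each triangle
        dev = deviations(ip)[0]

        # we add terms of the same dimension: dev == [distance], A == [distance**2]
        loss = np.sqrt(A) * dev + A

        # subdomains that are already small enough are never refined again ...
        loss[A < min_distance**2] = 0
        # ... and subdomains that are still too large are always refined first
        loss[A > max_distance**2] = np.inf
        return loss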
@@ -556,6 +577,15 @@
  "learner.plot(tri_alpha=0.3).relabel('1 / (x^2 + y^2) in log scale')"
  ]
  },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Awesome! We zoom in on the singularity, but not at the expense of sampling the rest of the domain a reasonable amount.\n",
+ "\n",
+ "The above strategy is available as `adaptive.learner.learner2D.resolution_loss`."
+ ]
+ },
  {
  "cell_type": "markdown",
  "metadata": {},