# PODNAME: Neo4j::Driver::TODO
# ABSTRACT: Information on planned improvements to Neo4j::Driver

=encoding utf8

=head1 TODO

=head2 Address open issues on GitHub

See L.

=head2 Functionality and API

=over

=item *

Implement spatial and temporal types.

=item *

Add timers to L (see C).

=item *

C error objects (e. g. L) instead of strings. It seems there are about four types that would need to be distinguished: illegal usage errors, internal driver errors, network errors, and Neo4j server errors. See also L<#7|https://github.com/johannessen/neo4j-driver-perl/issues/7>.

=back

=head2 Experimental features

=over

=item *

L: make stable and move filter implementation to C in preparation of Bolt v3 support

=item *

L: make illegal, but create a new config option that continues to provide C functionality (same flexibility, simpler API)

=item *

L: make illegal

=item *

L: make illegal for both nested explicit and nested autocommit transactions (for consistency with Bolt), but provide a driver config option (e. g. C<< nested_transactions => 1 >>) that lifts this restriction for HTTP

=item *

L: make illegal

=item *

L: This feature should no longer be exposed to the client. It complicates the API significantly and is not that big of an optimisation anyway, because results are typically fetched I<before> the next statement is run.

=item *

L, L: make driver config option

=item *

Calling list methods in scalar context (L, L, L, and L) should probably generally behave the same way using a list in scalar context would: It should return the number of items in the list.

=back

=head2 Tests, code quality, documentation

=over

=item *

Test roundtrip of special numeric values (very large integers, -0.0, ±Inf, ±NaN).

=item *

Convert tests in C to use L, so that testing/dev dependencies are fewer.

=item *

Improve test coverage:

=over

=item *

Many yet uncovered code paths are obviously fine, but difficult or impossible to cover.
In some of these cases, it may be possible to refactor the code, such as by banking on autovivification (i.e. don't defend against undefined C<$a> in expressions like C<< $a->{b} >>; see L).

=item *

The C subs contain a lot of assertions and other checks that are not normally necessary. Since this logic seems to work fine, it should be simplified. (This may also make it easier to unroll the recursion later.)

=item *

The current policy of not documenting deprecated methods is informed by the principle to "design interfaces that are: consistent; easy to use correctly; hard to use incorrectly". Perhaps simply listing the deprecated method names with a short note like "deprecated in 0.13" would be an acceptable addition that also fulfils the B coverage requirements.

=item *

Documenting each attribute in L as individual methods might be a quick way to bring up the B coverage stats a bit.

=back

=item *

Neo4j::Test should auto-detect the Neo4j server version and set the C config option accordingly.

=item *

Neo4j::Sim should implement the driver HTTP net module API instead of the L API, so that we can eventually replace REST::Client.

=item *

Write new unit tests for all modules.

=item *

Optimise the simulator for C<$hash = 0>. Use of C<<< << >>> causes statements to end with C<\n>, which the simulator could filter out. The internals test "transaction: REST 404 error handling" should run a distinct statement.

=item *

Try to change "no connection" tests such that no C responses are required by (ab)using a particular port number on localhost; the port could be checked beforehand and the test skipped if it happens to be open for some reason. Alternatively, the tests in question could be changed to a simulated connection failure.

=item *

Verify that L and L really do return undef (as documented), even when called in list context.

=item *

Add test for byte arrays (C).


=item *

Check that nodes without labels and nodes without properties are handled correctly in all aspects (there I<might> be issues with perlbolt).

=item *

Check that after a server error, the next statement will succeed (there I<might> be issues with perlbolt; see L).

=item *

List possible C output in L, allowing for indexing by search engines.

=item *

L: Clarify docs that this method is to be called in list context.

=back

=head1 Other ideas for specific modules

=head2 L

=over

=item *

Make the URL a config option, so that it can be queried (and changed).

=item *

Make the auth data a config option, so that it can be queried. As alternative ways to I<set> the auth data, C should continue to be supported as an alias and the user info should be parsed from the URL if given (however, URLs without user info should not change the auth data stored in the driver). A possible implementation would be to create a new L module that would offer suitable methods, but this seems like overkill. That said, userinfo is actually forbidden now by L, so this should perhaps not be added after all?

=item *

Allow passing config options directly to the constructor, e. g. in place of the URL (C<< Neo4j::Driver->new({ url=>"bolt:", timeout=>30 }) >>).

=item *

Change the default URI scheme from HTTP to auto-detect, i. e. try Bolt first, then HTTP in case of failure. This could be explicitly specified as e. g. C.

=item *

The C scheme could perhaps be mapped onto Bolt or onto the default URL scheme, just so that C URLs will kind of work.

=item *

Consider writing a concrete example that re-creates LOMS logic by re-blessing the structural types into custom types from the business logic using C<< $cypher_types->{init} >>. (For example, check for Neo4j nodes that are labelled C<:Person> and re-bless those as C or whatever.)

=back

=head2 L

=over

=item *

Once a session is created, the driver object becomes immutable. It should therefore be possible to store the ServerInfo in the driver object once it is obtained.
If the default database is added as well, the Discovery API doesn't need to be used again for a new session. This change would keep down network utilisation in scenarios where many sessions are created (such as running the driver's test suite).

=item *

Consider whether to offer L. If available, these should consist of subrefs passed to methods called C and C. These access modes are only an optimisation for Enterprise features. We don't target those at present, but C could then eventually be routed to a high-performance read-only server once clusters are supported. It would make sense to offer both methods right away even though initially they'd work exactly the same.

=back

=head2 L

=over

=item *

Consider supporting re-using C objects for query parameters in C. The Java and C# drivers do this.

=item *

Run statements lazily: Just like with the official drivers, statements passed to C should be gathered until their results are actually accessed. Then, and only then, all statements gathered so far should be sent to the server using a single request. Challenges of this approach include that notifications are not associated with a single statement, so there must be an option to disable this behaviour; indeed, disabled should probably be the default when stats are requested. Additionally, there are some bugs with multiple statements (see tests C and C). Since stats are now requested by default, this item might mean investing time in developing an optimisation feature that is almost never used. Since the server is often run on localhost anyway where latency is very close to zero, this item should not have high priority.

=item *

The current API to L seems a bit too complicated, especially for something that is probably hardly ever actually used in practice. It might make sense to provide a dedicated method (e. g. C<_run_multiple()>) for just that purpose. This would eventually free up the C implementation from unneeded baggage.
The dedicated method may be private as long as the plan still is to provide this functionality by running statements lazily.

=back

=head2 L

=over

=item *

Consider whether to implement methods to query the list of fields for this record (C, C, C) and/or a mapping function for all fields (C/C). Given that this data should easily be available through the Result object, these seem slightly superfluous though.

=item *

Implement C; see L, L.

=item *

Add C as alias for C, enabling clients to avoid the possibly confusing C<< $record->get->get >> pattern. The official drivers only offer C, and C might be too similar to C in the official Java driver, so this alias should perhaps be experimental.

=back

=head2 L

=over

=item *

Perhaps C should always buffer two records instead of just one. With the current implementation, the bolt connection might remain attached longer than desirable in cases where the client knows in advance how many records there will be and calls C exactly that number of times. (In theory, such a change might even slightly improve performance if the driver uses Perl threads to fill the buffer in the background.)

=back

=head2 L

=over

=item *

The entire package can probably be removed now.

=back

=head2 L

=over

=item *

Profile the server-side performance penalty of requesting stats for various kinds of queries. If the penalty turns out to be high, stats should perhaps have to be requested explicitly by clients (rather than being obtained by default, as with 0.13 and higher). However, using Bolt always provides stats, and different APIs for HTTP and Bolt seem like a bad idea.

=back

=head2 L

=over

=item *

C

=item *

It seems Neo4j 4 added new counters for system updates.

=back

=head2 L

=over

=item *

Rollback behaviour on errors needs further study. L says that all errors have a rollback effect, but in at least some cases, the effect seems to be merely to mark the tx as failed and uncommittable, which isn't quite the same thing.
This may or may not vary across error types, Neo4j versions, or Bolt versions. OTOH, some errors are internal client errors that shouldn't roll back the tx (L). Not sure if these occur in practice, but we should probably be able to handle them correctly anyway.

=back

=head2 L

=over

=item *

Consider unrolling C recursion. Based on initial profiling, this may save up to about 5% CPU time (for a specific HTTP test query cached in RAM, performance went from about 2700/s to 2850/s when skipping the call to C entirely). However, when accessing the database, the bottleneck is typically I/O (querying Neo4j itself instead of the RAM-cached response let the performance for the very same query drop down to 650/s when executed over HTTP). So this optimisation may not be worth it (OTOH, Bolt performance was something like 7000/s, so optimising C may be more useful there).

=item *

If a 201 is received without a C header, it is currently simply ignored by C<_parse_tx_status()>. (The simulator requires this.) According to RFC 7231, such a response means the location hasn't changed, i. e. the resource has been created at the default transaction endpoint. That should never happen; in fact, it should only ever happen for a PUT request, but we don't use those here. So ignoring this is probably the right choice. But it may still be useful to revisit this logic later on.

=item *

Parse the C date for transactions, sync client and server system dates using the C header and track transaction expiration.

=back

=head2 Neo4j::Driver::Net::HTTP::*

=over

=item *

Profile whether C has any performance impact. Remove if in doubt, as we don't implement streaming (and have no plans to in the future, considering that Bolt should be used when speed is important).

=item *

Remove the L dependency and use directly.

=back

=head2 Neo4j::Driver::Type::*

=over

=item *

The C and C operators should be overloaded to allow for ID comparison on nodes and relationships.


=item *

Consider whether to use C to allow direct access to e. g. properties using the pre-0.13 hashref syntax. See L for an example. L might perhaps also work. Note that L might be a memory hog; see L.

=item *

Try to refactor L's internal representation to allow either elements or nodes+rels. Have one autogenerate from the other, then cache the results. May not actually have advantages for deep_bless though.

=item *

Add C as alias for C in L and L, enabling clients to avoid the possibly confusing C<< $record->get->get >> pattern.

=back
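
The idea of overloading comparison operators for ID comparison on nodes and relationships, mentioned in the list above, might look roughly like the following. This is only a minimal sketch, assuming that the operators in question are numeric C<==> and C<!=>. The package C<Local::SketchNode>, its constructor and its C<id> accessor are hypothetical stand-ins for illustration and are not the driver's actual type classes.

```perl
use strict;
use warnings;

package Local::SketchNode;  # hypothetical stand-in, not a Neo4j::Driver class

use Scalar::Util qw(blessed);
use overload
    '==' => \&_same_id,
    '!=' => sub { !_same_id(@_) },
    fallback => 1;  # operators not listed here keep Perl's default behaviour

sub new {
    my ($class, $id) = @_;
    return bless { id => $id }, $class;
}

sub id { return $_[0]->{id} }

# Two objects count as equal if they are of the same class
# and carry the same numeric ID. Anything else is unequal.
sub _same_id {
    my ($self, $other) = @_;
    return !!0 unless blessed($other) && ref($other) eq ref($self);
    return $self->id == $other->id;
}

package main;

my $alice      = Local::SketchNode->new(42);
my $also_alice = Local::SketchNode->new(42);
my $bob        = Local::SketchNode->new(43);

print "same node\n"       if $alice == $also_alice;
print "different nodes\n" if $alice != $bob;
```

One consequence of such a design is that C<==> no longer compares reference addresses, so two distinct Perl objects representing the same graph entity would compare as equal. Whether IDs from different transactions or databases should ever compare as equal is a question this sketch does not address.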